Article 50 Transparency Looks Easy. The Engineering May Not Be.
Update — 11 May 2026: The AI Act Omnibus deal reached at the May trilogue moves the 2 August 2026 deadline for Annex III high-risk obligations to 2 December 2027, pending formal adoption and publication in the Official Journal. Article 50 transparency moves to 2 November 2026. See: EU AI Act High-Risk Deadline Delayed to December 2027.
Article 50 is the article most teams think they can ship in an afternoon. Add an “AI” badge to the chatbot, write a sentence in the help centre, done. That covers the first paragraph of Article 50. It doesn’t cover the obligation. The chat label is a UI element. The real obligation is structural: it applies on every channel where the system can talk to a person, it has to be machine-readable for synthetic content, and it has to reach people in physical spaces who never see a screen. Teams treating this as a copy task in spring 2026 will find by summer that the obligation lives in the system architecture, not on the marketing site.
The transparency requirements piece covers the four obligations at a survey level. This piece is about what they actually take to implement in a way that survives an inspection.
What Article 50 actually requires
Article 50 is the entirety of Chapter IV. Four distinct obligations, each with its own subject and its own bar, plus a fifth paragraph that governs when and how the information has to reach people.
Article 50(1) — Providers of AI systems “intended to interact directly with natural persons” must design and develop them so the people involved are informed they’re interacting with an AI, “unless this is obvious from the point of view of a natural person who is reasonably well-informed, observant and circumspect, taking into account the circumstances and the context of use.”
Article 50(2) — Providers of AI systems, “including general-purpose AI systems,” generating synthetic audio, image, video or text content, must ensure outputs are “marked in a machine-readable format and detectable as artificially generated or manipulated.” The technical solutions must be “effective, interoperable, robust and reliable as far as this is technically feasible.”
Article 50(3) — Deployers of emotion recognition or biometric categorisation systems must inform the people exposed and process the personal data in line with the GDPR.
Article 50(4) — Deployers of AI systems that generate or manipulate image, audio or video content “constituting a deep fake” must disclose that the content has been artificially generated or manipulated. For text published “with the purpose of informing the public on matters of public interest,” similar disclosure applies — with a carve-out where the content has undergone human review or editorial control and a natural or legal person holds editorial responsibility for the publication.
Article 50(5) — The information must be provided “in a clear and distinguishable manner at the latest at the time of the first interaction or exposure” and must meet applicable accessibility requirements.
All of Chapter IV applies from 2 August 2026 (2 November 2026 if the Omnibus delay noted at the top of this piece is adopted as agreed), regardless of risk classification. There is no exemption from Article 50 just because a system sits in the “limited risk” bucket rather than the high-risk one. All four obligations are also unusually easy for a regulator, journalist, or competitor to verify externally. A mystery shopper, a single API call, or a walk past a shop window is enough.
“Directly interact” is wider than chat
The first thing teams underestimate is the scope of “interact directly with natural persons.” Customer-service chatbots are the obvious case. But:
- Voice agents, including AI phone systems and IVR replacements, are the same obligation in a harder modality. A voice greeting that sounds human is exactly the situation 50(1) was written for. The disclosure has to be audible, at the start of the call, in the same channel.
- AI features inside a larger product — a generative summariser inside an email client, a draft-reply tool inside a CRM, a writing assistant inside a CMS — interact directly with natural persons. The product isn’t an AI system. The feature is. The disclosure obligation runs with the feature.
- Embedded chat on third-party sites — widgets dropped into customer sites by a SaaS provider — make the provider responsible for disclosure even when the deployer controls the surrounding UI. You can’t ship a widget that looks human and rely on the deployer to add a label.
- Mixed human–AI flows — a chat that starts with a bot and hands off to a human, or vice versa — must disclose AI at the points where AI is operating. A flow that silently swaps a human reply for a generated one is exactly the deception the article exists to prevent.
The Article 50(5) timing rule is the trap. “First interaction or exposure” means in the channel the user is actually using, at the moment they start using it. Not a footer that appears two clicks in. Not a paragraph in the privacy notice. Not a banner the user has already dismissed.
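One way to make that channel scope auditable is to treat disclosure as configuration that ships with each channel, rather than copy that lives on one surface. A minimal sketch, assuming a hypothetical in-house registry; the names, channels, and wording here are illustrative, not anything the Act or a specific library prescribes:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ChannelDisclosure:
    channel: str                   # "web_chat", "voice_ivr", "embedded_widget", ...
    modality: str                  # "text" or "audio": the disclosure must live in this modality
    wording: str                   # the disclosure actually delivered in the channel
    fires_on_first_contact: bool   # asserted by an automated test, not by policy alone

# Illustrative registry: every surface where the system talks to a person belongs here.
DISCLOSURES = [
    ChannelDisclosure("web_chat", "text", "You're chatting with an AI assistant.", True),
    ChannelDisclosure("voice_ivr", "audio", "This call is handled by an automated AI agent.", True),
    ChannelDisclosure("embedded_widget", "text", "AI assistant", True),
]

def undisclosed_channels(known_channels: set[str]) -> list[str]:
    """Return the channels with no first-contact disclosure on record."""
    covered = {d.channel for d in DISCLOSURES if d.fires_on_first_contact}
    return sorted(known_channels - covered)

if __name__ == "__main__":
    gaps = undisclosed_channels({"web_chat", "voice_ivr", "embedded_widget", "mobile_app"})
    if gaps:
        raise SystemExit(f"Channels with no first-contact disclosure: {gaps}")
```

The point is not the data structure. The point is that “which channels disclose, and how” becomes something a build step can check rather than something a policy asserts.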
The “obvious AI” exception is a trap
The 50(1) carve-out for obviousness reads as a permission slip and is almost never one in practice. The standard is “a reasonably well-informed, observant and circumspect natural person, taking into account the circumstances and the context of use.” That’s a higher bar than “a tech-literate user spotting a chatbot UI.”
A modern voice agent isn’t obviously AI. A chatbot using first-person empathy and conversational pacing isn’t obviously AI. A long-form drafted reply to a customer email isn’t obviously AI. The working assumption that “everyone knows what a chatbot is now” isn’t what the standard says. The user is the test. The provider doesn’t get to decide whether its own system counts as obvious.
There’s also a regulatory incentive problem. Disclosure is cheap. The exception is a legal argument. An auditor reviewing a chatbot with no disclosure has to be persuaded. An auditor reviewing one with a clear “AI assistant” label has nothing to argue about. Teams that ship without disclosure on the obviousness theory are choosing the path that produces a finding under uncertainty.
Synthetic content is a provenance problem
The 50(2) machine-readable obligation is where the engineering work hides. “Marked in a machine-readable format” is not a visible label. It’s a mark carried by the content itself, as embedded metadata, a signal-level watermark, or both, so that the systems consuming the content — search engines, browsers, social platforms, fact-checkers — can detect it.
Two technical traditions are converging on this.
Content provenance metadata. The C2PA standard (Coalition for Content Provenance and Authenticity), backed by Adobe, Microsoft, the BBC, and others. Cryptographically signed manifests attached to images, video, and audio describing how the content was produced and modified. Resilient against benign re-encoding when pipelines are aware of it. Fragile against deliberate stripping.
Watermarking. Signal-level marks embedded in pixels, audio samples, or token distribution. SynthID, Meta’s invisible image watermarks, and a growing set of academic schemes for text. More resistant to stripping than metadata, but with known attack surfaces — cropping, compression, paraphrasing for text.
Article 50(2) doesn’t name a standard. It requires marking that’s “effective, interoperable, robust and reliable as far as this is technically feasible,” and Article 50(7) tasks the AI Office with encouraging codes of practice. In practice, if you’re shipping today you need both: C2PA-style manifests for the metadata layer and watermarking for the signal layer, because each covers a failure mode the other doesn’t. Harmonised standards under the EU process will eventually settle on a baseline, but providers shipping into the August 2026 window can’t wait for them.
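At the point of generation, “both layers” is a pipeline step rather than a policy. A sketch of the shape, with deliberately stubbed helpers: `attach_c2pa_manifest` and `embed_watermark` stand in for whatever C2PA SDK and watermarking tooling you actually adopt, since those APIs vary by vendor and version.

```python
from pathlib import Path

def attach_c2pa_manifest(asset: Path, claim_generator: str) -> None:
    """Stub: sign and embed a C2PA manifest recording how the asset was produced.
    In practice this wraps the C2PA SDK or the c2patool CLI."""
    raise NotImplementedError

def embed_watermark(asset: Path) -> None:
    """Stub: apply a signal-level watermark (e.g. an invisible image watermark)."""
    raise NotImplementedError

def mark_output(asset: Path, claim_generator: str = "image-pipeline") -> Path:
    """Apply both marking layers before the asset is stored or served.
    The manifest covers interoperability and provenance detail; the watermark
    covers the case where the manifest is stripped downstream."""
    attach_c2pa_manifest(asset, claim_generator)
    embed_watermark(asset)
    return asset
```

The placement matters more than the ordering: marking has to happen before the asset leaves the generation service, because every downstream hop is a chance for it to be lost.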
For GPAI wrappers, the obligation flows through the provider chain. If you generate images by calling an external API — say, an OpenAI image endpoint — the API provider’s marking is the foundation, but you remain responsible for not stripping it as it passes through your pipeline. Re-encoding a marked image without preserving the manifest, or paraphrasing watermarked text through your own summariser, is precisely the failure mode the obligation is meant to cover.
Provider and deployer obligations are different
Article 50 splits responsibility along the same provider/deployer line as the rest of the Act, and the split confuses teams.
Providers of generative systems must mark the outputs (50(2)). The obligation sits with the entity placing the system on the market — even when their customers are the ones publishing the content.
Deployers publishing deepfakes must disclose that the content is AI-generated (50(4)). The obligation sits with the entity making the publication decision — even where the original provider already marked the file.
Both can apply to the same image. Picture this. Your marketing team uses a third-party image generator for a campaign asset. The image arrives marked by the provider (50(2) satisfied on their side). When your team posts the image on your company website, the deployer disclosure obligation kicks in (50(4)). The mark is for machines. The visible disclosure is for the audience. Both have to be there. This is the same provider/deployer distinction that catches teams under the provider trap, in a different domain.
The deepfake test isn’t “did we use AI” — it’s whether the content “appreciably resembles existing persons, objects, places, entities or events and would falsely appear to a person to be authentic or truthful.” A stylised illustration isn’t a deepfake. A photorealistic image of a real product, location, or person is. The closer the output is to documentary realism, the firmer the disclosure obligation. The artistic-and-satirical carve-out limits the obligation to disclosing the existence of the generated content “in an appropriate manner that does not hamper the display or enjoyment of the work.” It doesn’t eliminate it.
Emotion recognition and biometric categorisation
Article 50(3) lands hardest on operational deployments. The notification is owed in the physical world, often to people who never see a screen.
The scope is bounded by Article 5. Emotion recognition in workplaces and educational institutions is prohibited outright, with narrow exceptions for medical and safety purposes. Call-centre sentiment monitoring of employees, classroom attention tracking, retail-staff “engagement” analytics — these aren’t Article 50 disclosure problems. They’re Article 5 prohibitions. Notifying the employee doesn’t legalise the system.
What remains under 50(3): emotion recognition in customer-facing contexts (caller sentiment for service-quality routing, voice-stress analysis for fraud), biometric categorisation in physical spaces (retail demographic analytics, audience measurement), and similar systems where the affected people are users or visitors rather than workers or students.
The notification has to actually reach them. A footer in a privacy policy doesn’t. Signage at the entrance of a monitored space, an audible disclosure at the start of a call, a notice within the app where the categorisation happens — the question to design against is: how does the affected person learn about this before the processing begins? Because the obligation is on the deployer, the design responsibility can’t be outsourced to the provider of the analytics tool.
The GDPR runs in parallel. Biometric data is special-category data under GDPR Article 9, which means a separate lawful basis is required regardless of the AI Act notification. An organisation that has solved Article 50(3) and not solved the GDPR question has solved one of two problems.
Documentation expectations
Article 50 doesn’t produce a single dossier the way Annex IV technical documentation does, but the records that survive an inspection are concrete:
- The disclosure design for each interactive system — channel, wording, placement, and the test that confirms the disclosure fires on first contact.
- The marking pipeline for any generative output — the format used (C2PA manifest, watermark scheme), the CI verification step confirming outputs carry the mark, and the policy for any re-encoding step that might strip it.
- The deployer disclosure decision for any synthetic content publication — who signed off, against what test of “deepfake,” with what visible mark on the published asset.
- The notification design for any emotion or biometric categorisation system — placement, wording, and evidence that affected people actually receive it (signage photos, call-opening recordings, in-app screenshots).
- The interaction with Article 26 deployer obligations where the deployer is also responsible for record-keeping and staff instructions.
A policy that asserts disclosure happens, with no evidence showing it does, has the same hole as the Article 12 logging problem in a different domain. The fix is the same: tests, in CI, that confirm the system behaves as the policy claims.
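A minimal example of what such a test can look like for a chat channel, assuming a hypothetical staging endpoint at `/api/chat` that returns the assistant’s first message as `reply`; the URL, payload shape, and approved wording are illustrative, not a prescribed format:

```python
# test_article50_disclosure.py -- run in CI against a staging deployment.
import os

import requests

# Hypothetical endpoint and wording; substitute your own channel and approved copy.
CHAT_URL = os.environ.get("CHAT_URL", "https://staging.example.com/api/chat")
DISCLOSURE_PHRASES = ("ai assistant", "automated assistant")

def test_first_reply_contains_disclosure():
    """Art. 50(1)/(5): the disclosure must appear at first interaction, in the channel itself."""
    resp = requests.post(CHAT_URL, json={"message": "hello"}, timeout=10)
    resp.raise_for_status()
    first_reply = resp.json()["reply"].lower()
    assert any(phrase in first_reply for phrase in DISCLOSURE_PHRASES), (
        "First assistant message does not disclose that the user is talking to an AI system"
    )
```

The voice channel needs the same thing run against the call-opening audio. Harder to automate, but no less checkable, and the recording doubles as the evidence.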
Common traps
The disclosure that disappears after the first message. A bot greets with “I am an AI assistant” and then everything looks human from the second message onward. The label has to persist. It’s a property of the channel itself, and the opening line alone doesn’t satisfy it.
The voice agent that sounds human and discloses on the website. The channel is the call. The disclosure has to be in the call.
Stripping provenance in the pipeline. Marked images go through a CDN that re-encodes them for thumbnails, losing the manifest. Marked text gets paraphrased by a summariser further down your stack, removing the watermark. The mark survives only as long as your pipeline preserves it.
Treating “obvious” as a default. If the disclosure costs nothing to add and the exception costs an audit argument, the cost-benefit always points one way.
Confusing the provider mark with the deployer disclosure. A C2PA manifest on a marketing image is the provider’s compliance. A line under the post saying “image generated with AI” is the deployer’s. Both are required. One doesn’t satisfy the other.
Notifying employees when the analytics are prohibited. Adding a sign to the call-centre wall doesn’t legalise emotion recognition on workers. The Article 5 prohibition isn’t a disclosure obligation.
Footer-only notices for biometric categorisation in physical spaces. The affected person is in the building. They’re not on the website.
Accessibility-blind disclosures. Article 50(5) requires conformity to applicable accessibility requirements. A visual-only label fails for a screen-reader user. An audio-only disclosure fails for a deaf user. The disclosure design has to clear the same accessibility bar as the rest of the product.
What to do now
If you’re a provider or deployer with AI in customer-facing channels:
- Audit every channel. List every place your system talks to a person — web chat, voice IVR, embedded widgets, mobile app, in-product features. For each, document the disclosure that fires at first interaction.
- Add disclosure to the channel layer. A label on the homepage doesn’t satisfy 50(1) for a voice call. Make disclosure a property of the conversation itself, in whatever modality the conversation is happening.
- Wire marking into the generation pipeline. Every synthetic output your system produces should carry a C2PA manifest, a watermark, or both. Add a CI test that fails the build if a sampled output comes through unmarked (a minimal version is sketched after this list).
- Check your re-encoding paths. Anywhere an image is resized, a video is transcoded, or text is paraphrased later in your stack, confirm the mark survives. This is where 50(2) compliance quietly leaks.
- Map 50(3) notifications to physical spaces. If your deployment uses cameras, microphones, or sentiment analysis in a physical environment, design the notification for the people in that environment — signage, audible cues, app messages — not a privacy policy update buried two pages deep.
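As flagged in the marking and re-encoding items above, the check worth automating is blunt: sample real outputs after each transformation step and fail the build if the mark is gone. A minimal sketch, assuming the c2patool CLI is installed for manifest inspection (its flags, output, and exit codes vary by version, so pin and verify the one you use) and a hypothetical `detect_watermark` helper for the signal layer:

```python
# check_marks.py -- sample pipeline outputs and fail CI if provenance was lost in transit.
import subprocess
import sys
from pathlib import Path

def has_c2pa_manifest(asset: Path) -> bool:
    """Ask c2patool to read the manifest store. Treat a non-zero exit or an empty
    report as 'no manifest'. Assumes c2patool is on PATH; behaviour differs across
    versions, so validate against the one pinned in your CI image."""
    result = subprocess.run(["c2patool", str(asset)], capture_output=True, text=True)
    return result.returncode == 0 and "c2pa" in result.stdout.lower()

def detect_watermark(asset: Path) -> bool:
    """Stub for the vendor- or model-specific watermark detector you rely on."""
    raise NotImplementedError

def main(sample_dir: str) -> int:
    failures = []
    for asset in Path(sample_dir).rglob("*.jpg"):
        if not has_c2pa_manifest(asset):
            failures.append(f"{asset}: C2PA manifest missing (stripped during re-encode?)")
        # Enable once a detector exists for your watermarking scheme:
        # if not detect_watermark(asset):
        #     failures.append(f"{asset}: watermark not detected")
    for line in failures:
        print(line, file=sys.stderr)
    return 1 if failures else 0

if __name__ == "__main__":
    sys.exit(main(sys.argv[1] if len(sys.argv) > 1 else "sampled_outputs"))
```

Run it against assets as they leave the CDN or publishing step, not as they leave the generator; the hop where the mark gets stripped is the one you are trying to catch.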
Article 50 is unusually easy to verify and unusually easy to underestimate. A regulator doesn’t need to subpoena documentation to find a violation — they call the support line, send a test message, look at a marketing image. Penalties for transparency breaches reach €15 million or 3% of global turnover, and the breaches themselves are the easiest in the Act for an inspector to evidence. The obligations look like UI work and turn into engineering work the moment the system has more than one channel, the moment generated content moves through any pipeline, the moment a notification has to reach a person who isn’t online.