Unveiling AI's Reading Material: Exploring the Inner Workings of Generative Citation Processes
A new study, "What is AI Reading" by Generative Pulse, has revealed that AI models like OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude tend to cite different types of sources based on their design and industry context, with notable variations across the models and prompt types.
Distinct Citation Preferences
The study analyzed over 1 million citations from major AI systems and found that each model favours distinct sources.
- Claude favours authoritative, well-sourced content, including government sources and technical documents, supporting thought leadership and expert positioning strategies. Claude also has transparent citation behaviours, often linking to academically credible or official sources.
- ChatGPT cites mainstream news and journalistic outlets more often, valuing current, credible, third-party earned media. It runs 1–3 parallel web searches depending on context and cites sparingly, with the breadth of sources varying by prompt; it often draws on recent news articles and blogs.
- Gemini, while less publicly documented regarding citation behaviour, leans towards democratizing knowledge by regularly citing more community or educational content platforms like Wikipedia, Coursera, Quora, and YouTube.
Citation Patterns Across Industries and Prompt Types
The study found that AI citations favour earned media (credible, third-party content) over paid or corporate promotional content. About 95% of AI citations point to non-paid media, with 27% from journalistic outlets and 37% from independent corporate blogs.
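As a rough illustration of how shares like these could be tallied, the sketch below groups cited URLs by source category and computes each category's fraction of the total. It is not the study's method: the category names, the domain-to-category map, and the sample URLs are all hypothetical placeholders.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical domain-to-category map; the study's real taxonomy is not described here.
CATEGORY_BY_DOMAIN = {
    "reuters.com": "journalistic",
    "en.wikipedia.org": "community_educational",
    "cdc.gov": "government",
    "example-brand.com": "paid_or_promotional",
}


def categorize(url: str) -> str:
    """Map a cited URL to a source category, or 'other' if the domain is unknown."""
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return CATEGORY_BY_DOMAIN.get(host, "other")


def category_shares(urls):
    """Return each category's fraction of all citations (empty dict for no input)."""
    counts = Counter(categorize(u) for u in urls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


# Made-up sample citations, for illustration only.
sample_citations = [
    "https://www.reuters.com/technology/some-story",
    "https://en.wikipedia.org/wiki/Large_language_model",
    "https://www.cdc.gov/some-report",
    "https://example-brand.com/sponsored-post",
]
shares = category_shares(sample_citations)
```

In practice the hard part is the category map itself: deciding whether a given domain counts as earned, owned, or paid media is a judgment call that a lookup table like this one only approximates.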
Citation patterns reflect industry needs. In technical or expert-driven fields (e.g., government, science), models cite official, academic, and technical documents more often, and Claude does so in particular. In consumer-facing or media-driven sectors, citations lean towards recent news, blogs, and accessible educational platforms, the sources favoured by ChatGPT and Gemini.
AI models tend to cite content that is quotable, clear, and semantically well-matched to the prompt, rather than content optimized only for traditional SEO. In other words, relevance and clarity matter more than standard SEO metrics.
Summary of Citation Tendencies
| AI Model | Favoured Sources | Citation Style | Industry Use Cases |
|----------|------------------|----------------|--------------------|
| Claude | Government sources, technical docs, academic content | Transparent, authoritative | Technical, academic, governmental |
| ChatGPT | Mainstream news, journalism, third-party blogs | Sparse but contextual citations | Consumer news, media, general info |
| Gemini | Wikipedia, Coursera, Quora, YouTube | Democratized, wide knowledge base | Educational, general knowledge |
In conclusion, AI models exhibit clear distinctions in citation preferences shaped by their architecture and intended application context. These differences can significantly impact the type and quality of content that users encounter online. Being cited by AI can put your content in front of users who may never visit your site but trust the model referencing it.
Data and cloud computing technology plays a crucial role in supporting AI models such as OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude as they perform thousands of web searches and cite multiple sources to generate responses.
AI models, particularly ChatGPT and Gemini, rely on distinct sources for their citations: ChatGPT prefers mainstream news and journalistic outlets, while Gemini favours community and educational content platforms such as Wikipedia and Coursera.