Try something. Open ChatGPT or Perplexity and ask it to summarise your company’s most recent quarterly results. Then compare what comes back against what your IR team actually published.
If you’re lucky, the AI will cite your own earnings release. If you’re unlucky (and this is the more common outcome) it will cobble together an answer from a broker note, a news aggregator, and a two-year-old blog post that got one of your segment numbers wrong. Your company’s story, told by someone else, with errors you can’t correct after the fact.
This isn’t a hypothetical future problem; AI crawler traffic to corporate websites rose 18% year-on-year in 2025, with OpenAI’s GPTBot alone surging over 300%. Your IR website may already be read more frequently by machines than by humans.
The investment industry has moved in parallel. A 2025 AIMA survey found that 95% of alternative investment and hedge fund managers now use generative AI in their work. Equity research firms routinely employ automated scraping to pull financial and ESG data directly from corporate websites. The audience for your IR content has fundamentally changed, and most IR websites aren’t remotely ready for it.
Why most IR websites fail their machine audience
If you’re a CFO, you don’t need to understand the technical plumbing in detail. But you do need to understand where the problem sits, because the consequences land squarely in your domain: valuation, cost of capital, analyst coverage, and the accuracy of your company’s public narrative.
Here’s what’s going wrong at most listed companies right now.
Your financial data is locked in PDFs. Annual reports, quarterly filings, sustainability disclosures: the vast majority still live in PDF format. Humans can read them. AI can’t reliably parse them, especially tables, footnotes, and anything that depends on layout. When a machine encounters your 200-page annual report as a PDF, it’s often little better off than if it were trying to read a photograph. Key figures get missed. Context gets stripped. The problem starts upstream: an EY survey of chief accounting officers at major US companies found that only a quarter held their ESG data in dedicated reporting software, while over half were still using spreadsheets. If the data isn’t structured internally, it’s certainly not structured for the machines trying to read it from outside.
Your website was designed for eyeballs, not crawlers. JavaScript-heavy pages that load content dynamically are a particular culprit. Many AI crawlers don’t execute JavaScript. The content may render perfectly in a browser and be completely invisible to a bot. If your IR section relies on JavaScript widgets to display stock data or financial highlights without a plain-text fallback, that information doesn’t exist as far as AI is concerned.
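One way to see your pages the way a non-JavaScript crawler does is to look only at the text present in the raw HTML the server returns. The sketch below is a rough approximation, not a model of any particular crawler: the function name `missing_from_raw_html`, the two sample pages, and the figures in them are all made up for illustration.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects the text present in raw HTML, ignoring script and style bodies."""

    SKIPPED = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def missing_from_raw_html(raw_html, required_phrases):
    """Return the phrases that a crawler reading only raw HTML would not find."""
    parser = VisibleText()
    parser.feed(raw_html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return [p for p in required_phrases if p.lower() not in text]


# Hypothetical pages: the same content served statically and via a JS widget
PHRASES = ["Q4 2025 Results Summary", "Revenue: 4,213 million"]
STATIC_PAGE = (
    "<html><body><h1>Q4 2025 Results Summary</h1>"
    "<p>Revenue: 4,213 million</p></body></html>"
)
WIDGET_PAGE = (
    '<html><body><div id="highlights"></div>'
    '<script>render({"title": "Q4 2025 Results Summary", '
    '"line": "Revenue: 4,213 million"})</script></body></html>'
)

print(missing_from_raw_html(STATIC_PAGE, PHRASES))
print(missing_from_raw_html(WIDGET_PAGE, PHRASES))
```

Run against the static page, nothing is missing. Run against the widget page, both phrases are missing, even though a browser would show them. In practice your web team would feed in the HTML fetched from the live IR pages rather than these sample strings.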
Then there’s the consistency problem. The revenue figure in the press release doesn’t quite match the figure on the IR website because someone rounded differently. The segment names changed two quarters ago but the old nomenclature is still floating around in a prior-year PDF. AI systems cross-reference sources. They’ll notice these discrepancies, and when they do, they’re more likely to pull from a third party that at least appears internally consistent than from your own site.
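A consistency check of this kind is easy to automate once the key figures from each channel sit side by side. The following is a minimal sketch under stated assumptions: figures are entered by hand in millions, the source names and numbers are invented, and the 0.5 tolerance is an arbitrary allowance for rounding that you would set yourself.

```python
from decimal import Decimal


def find_discrepancies(figures, tolerance=Decimal("0.5")):
    """Flag metrics whose published values differ across sources.

    figures maps metric -> {source name: value in millions}.
    tolerance is the largest spread still treated as rounding noise.
    """
    issues = {}
    for metric, by_source in figures.items():
        values = list(by_source.values())
        spread = max(values) - min(values)
        if spread > tolerance:
            issues[metric] = {"spread": spread, "values": dict(by_source)}
    return issues


# Hypothetical figures, in millions, as published in three places
EXAMPLE = {
    "revenue": {
        "press_release": Decimal("4213"),
        "ir_website": Decimal("4210"),
        "annual_report_pdf": Decimal("4213"),
    },
    "net_income": {
        "press_release": Decimal("512"),
        "ir_website": Decimal("512"),
        "annual_report_pdf": Decimal("512"),
    },
}

for metric, detail in find_discrepancies(EXAMPLE).items():
    print(metric, detail["spread"], detail["values"])
```

Here only revenue is flagged, because one channel rounded to the nearest ten million. The same structure extends to segment names: list what each channel calls each segment and flag any that do not match.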
Half the visitors you actually want can’t get in. CAPTCHAs and aggressive IP blocking were designed to stop malicious scraping. In practice, they also block equity research crawlers, ESG rating agencies, and the very AI systems you want reading your data. But the barriers aren’t only technical. Financial information is the number one reason investors visit corporate websites, according to Bowen Craggs visitor research, and also the number one source of frustration when they can’t find it. A Nielsen Norman Group usability study across 94 corporate IR sites put it more bluntly: professional investors said they wouldn’t rely on a company’s own website for most financial data at all, defaulting to Bloomberg and Reuters instead. If human investors are already leaving your site for better-structured alternatives, AI systems will do the same, faster and without telling you.
You are not managing what the machines already know about you. This is the one that keeps me up at night, professionally speaking. AI models carry whatever they ingested during their last training cycle: that unflattering opinion piece from 18 months ago, that analyst note with the incorrect target price, that blog post that speculated about your strategy and got it wrong. If you aren’t actively ensuring your own authoritative data is the easiest thing for these systems to find and ingest, someone else’s version of your story becomes the default.
What to do about it
The good news is that this is a tractable problem, and you don’t need to solve all of it at once. The less good news is that nobody else is going to solve it for you.
Three things this quarter
Basic hygiene
- Check your robots.txt file. Ask your web team whether GPTBot, ClaudeBot, and PerplexityBot are explicitly permitted to crawl your IR section. If nobody knows the answer, that is the answer. Many corporate websites block AI crawlers by default. You want to allow the legitimate ones in while rate-limiting the aggressive ones, which is enforced at the server or CDN rather than in robots.txt itself. Deciding who gets in is a policy decision, and it takes about an hour.
- Get your key content out of PDF jail. Start with the pages that matter most for how AI represents your company: the earnings release summary, your company profile, your leadership team, and your dividend policy. Publish these as clean HTML pages with clear headings. A machine that can read “Q4 2025 Results Summary” followed by your revenue figure in plain text will get your numbers right. A machine trying to extract the same from a scanned PDF table might not.
- Add structured data markup. Schema.org lets you tag your company profile as an “Organization,” your press releases as “NewsArticle” items, your earnings calls as “Event” entries. Think of it as nutrition labels for corporate information. You’re not changing what you disclose. You’re making what you already disclose readable in the format machines expect. Your web team can implement this in a matter of weeks.
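For the structured data item above, the markup is usually embedded in the page as a JSON-LD script tag. The sketch below shows roughly what your web team would generate for a company profile and a press release. The company name, URLs, headline, and date are placeholders, and the helper names are hypothetical; the property names (`name`, `url`, `headline`, `datePublished`, `publisher`) come from the Schema.org vocabulary.

```python
import json


def organization(name, url):
    """Schema.org Organization description for the company profile page."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
    }


def news_article(headline, date_published, url, publisher_name, publisher_url):
    """Schema.org NewsArticle description for a press release page."""
    return {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        "datePublished": date_published,
        "url": url,
        "publisher": {
            "@type": "Organization",
            "name": publisher_name,
            "url": publisher_url,
        },
    }


def json_ld_script(data):
    """Wrap a JSON-LD object in the script tag that goes into the page HTML."""
    payload = json.dumps(data, indent=2)
    # Escape "</" so the payload can never close the script element early
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder values for illustration only
release = news_article(
    headline="Q4 2025 Results Summary",
    date_published="2026-02-12",
    url="https://www.example.com/investors/news/q4-2025-results",
    publisher_name="Example Holdings plc",
    publisher_url="https://www.example.com",
)
print(json_ld_script(organization("Example Holdings plc", "https://www.example.com")))
print(json_ld_script(release))
```

Nothing in the visible page changes. The tag simply restates, in a form machines expect, what the page already says.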
Three for next year
Competitive advantage
- Build an analyst FAQ page. What questions do analysts actually ask you? Not “how do I contact your transfer agent” but “what is your dividend policy?” and “what were your key results this quarter?” Put the question in the heading, the direct answer in the first sentence. This is exactly the format generative AI tools are designed to extract and cite. If you build it, they are far more likely to quote you rather than someone else.
- Make your data downloadable. Historical financials as Excel or CSV, alongside the narrative PDF. Simple, powerful, and it signals seriousness about data quality. In our experience, companies that have done this see meaningful reductions in ad-hoc analyst requests for basic data. The same pattern shows up next door: in ESG reporting, automating the collection of data that previously lived in spreadsheets and PDFs has cut manual reporting time by up to 60%. Structured data that machines can read without human intervention turns out to reduce the volume of questions humans have to answer.
- The longer-term play: an API. A controlled data tap that lets authorised systems fetch datasets from your IR platform programmatically, without scraping your website. This is the gold standard for serving the analyst platforms and data aggregators that form the backbone of how the market sees your company. Real development work, yes. But the case studies are persuasive: when authorised systems can pull structured data directly rather than scraping websites, the volume of manual data requests drops sharply. What follows is less obvious but more valuable: better ESG ratings, new analyst initiations, and the kind of quiet reputational lift that comes from being the company whose data just works.
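The downloadable-data item above needs very little engineering. As an illustration, here is a minimal sketch that writes historical figures to CSV in a one-row-per-figure layout, which tends to be easier for machines to ingest than a wide spreadsheet. The column names and the sample rows are assumptions for the example, not a reporting standard.

```python
import csv
import io

# Assumed column layout: one row per period and metric
FIELDS = ["period", "metric", "value", "unit", "currency"]


def financials_to_csv(rows):
    """Serialise a list of row dicts to CSV text with a fixed header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical figures for illustration only
ROWS = [
    {"period": "FY2024", "metric": "revenue", "value": "3987",
     "unit": "millions", "currency": "USD"},
    {"period": "FY2025", "metric": "revenue", "value": "4213",
     "unit": "millions", "currency": "USD"},
]

print(financials_to_csv(ROWS))
```

Stating the unit and currency on every row is deliberate: it removes the most common source of misreading when a figure is lifted out of its table. The same rows are a natural starting point for the API described in the last item.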
The cost of not doing this
There is a version of this article that frames everything above as an exciting opportunity. I considered writing it. The data supports the case: studies have shown that companies with superior digital disclosure practices trade at a valuation premium relative to peers.
But let me be direct about the risk side, because that’s what should actually get this onto your agenda. If your financial and ESG data isn’t structured for machine consumption, it will be either overlooked or misrepresented by the automated systems that now mediate how institutional investors form their initial views of your company. That is not a technology risk. It is a valuation risk, a coverage risk, and a governance risk.
The regulatory environment is pushing in the same direction. The SEC already mandates Inline XBRL (machine-readable tagging embedded directly in financial filings) for financial statements. The EU’s Corporate Sustainability Reporting Directive requires digital tagging of sustainability information. The ISSB, the body writing the global baseline for sustainability disclosure, is building that baseline digital-first. These mandates are coming regardless, and the foundational work they require overlaps substantially with what an AI-ready IR website needs.
So the question isn’t really whether to do this. It’s whether you do it proactively, on your own timeline, with the chance to gain a competitive advantage — or reactively, on a regulator’s timeline, at which point your peers have already moved.
How to know it’s working
If you fund this, you’ll want to know whether it’s delivering results. Traditional IR website metrics (page views, report downloads, time on site) are still relevant but no longer sufficient.
Run a quarterly “AI citation audit”: ask the major AI platforms questions about your company and check whether the answers come from your own IR site or from third-party sources. Monitor whether your data on Bloomberg, Refinitiv, and FactSet matches what you have published. Track whether your pages appear in Google’s AI overviews. Count the reduction in ad-hoc data requests. These are the leading indicators of whether you’re winning or losing the battle for control of your own narrative in an AI-mediated market.
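The citation audit can be scored very simply. The sketch below assumes someone has asked the AI platforms your standard questions and noted down the URLs each answer cited; it then works out what share of those citations point at your own domain. The function name and the URLs are hypothetical.

```python
from urllib.parse import urlparse


def own_site_share(cited_urls, own_domain):
    """Fraction of cited URLs hosted on own_domain or one of its subdomains."""
    if not cited_urls:
        return 0.0
    own_domain = own_domain.lower()
    own = 0
    for url in cited_urls:
        host = (urlparse(url).hostname or "").lower()
        if host == own_domain or host.endswith("." + own_domain):
            own += 1
    return own / len(cited_urls)


# Hypothetical citations collected by hand from one quarter's audit
CITED = [
    "https://ir.example.com/results/q4-2025",
    "https://www.example.com/investors/dividend-policy",
    "https://news.example.net/markets/example-holdings-q4",
    "https://broker.example.org/notes/example-holdings",
]

print(own_site_share(CITED, "example.com"))
```

Tracked quarter on quarter, this one number tells you whether the machines are increasingly quoting you or quoting someone else.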
If you sit in the CFO’s chair, this one is yours.
Frequently asked questions
Why can’t AI tools find my company’s financial data?
Most corporate IR websites store financial data in PDF format, which AI systems cannot reliably parse — especially tables, footnotes, and complex layouts. JavaScript-heavy pages that load content dynamically are another common culprit, since many AI crawlers don’t execute JavaScript. If your key figures aren’t published as clean HTML with clear headings, they may be invisible to the machines that now drive investment research.
How do I check if AI crawlers are blocked from my IR website?
Ask your web team to review your robots.txt file and check whether GPTBot, ClaudeBot, and PerplexityBot are explicitly permitted to crawl your IR section. Many corporate websites block AI crawlers by default. You want to allow legitimate ones in while rate-limiting aggressive scrapers — a policy decision that typically takes about an hour to implement.
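If your web team wants a quick way to answer that question, Python’s standard library includes a robots.txt parser. The sketch below checks the three crawlers named above against a sample robots.txt. The file contents and the URL are invented for the example; in practice you would paste in the contents of your own file.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def crawler_access(robots_txt, url, agents=AI_CRAWLERS):
    """Report, per crawler, whether robots_txt permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical robots.txt that blocks one AI crawler from the IR section
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /investors/

User-agent: *
Allow: /
"""

result = crawler_access(
    SAMPLE_ROBOTS, "https://www.example.com/investors/results/q4-2025"
)
print(result)
```

With this sample file, GPTBot is blocked from the IR section while the other two are allowed. Bear in mind that robots.txt only states your policy. It does not enforce it, and it does not cover blocking done by CAPTCHAs or firewalls.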
What is structured data markup for investor relations?
Structured data markup uses the Schema.org vocabulary to tag corporate information in a format machines can read. You can label your company profile as an “Organization,” press releases as “NewsArticle” items, and earnings calls as “Event” entries. It doesn’t change what you disclose; it makes what you already disclose readable in the format AI systems and search engines expect.
How do I measure whether my IR website is working for AI?
Run a quarterly “AI citation audit”: ask the major AI platforms questions about your company and check whether the answers cite your own IR site or third-party sources. Also monitor whether your data on Bloomberg, Refinitiv, and FactSet matches what you have published, and track whether your pages appear in Google’s AI overviews.
Advising listed companies representing over $50 billion in aggregate market capitalisation.