
Conquering XBRL syntax with TNT

March 31, 2010 15:46 by

The benefits of XBRL as a standard electronic format for the exchange of financial information are enormous. Electronic analysis of traditional paper reports requires expensive, error-prone re-keying.  PDF and HTML reports are only slightly more accessible; data extraction is ad-hoc and imperfect.  In contrast, XBRL reports are structured and machine-readable.  If you can process one, you can process them all.

XBRL instance documents carry the data: a set of dimensionally qualified facts. Each monetary value is tied to an ISO currency code, and every fact is rigorously defined by a concept in a supporting taxonomy.  The taxonomy defines the reporting vocabulary and provides a wealth of metadata, which supports validation, guides interpretation, and facilitates comparison.  The instance document tells you that Apple’s total current assets on the 26th of September 2009 stood at 31.555 billion US dollars.  The taxonomy tells you which other figures contribute to that total.

The flexible nature of financial reporting is addressed in XBRL through taxonomy extensibility, which allows users to augment and customise taxonomy metadata.  While this provides significant power, it means that consuming taxonomy metadata is very much a runtime consideration rather than something that can be tackled once at dev-time.  Amazon’s income statement contains a custom concept for “Technology and content” expenses.  Google’s balance sheet includes a custom concept for the “Total number of shares of Class A common stock subject to repurchase”.  This flexibility can be frustrating for technologists, but it’s an inherent and necessary feature of financial reporting.  XBRL is like a box of chocolates; you never know what you’re going to get.

Unfortunately, XBRL’s semantic power is matched by its syntactic complexity. Data and metadata are spread across multiple files, and though XBRL is XML, it is not conventionally structured, so it is not readily amenable to processing with traditional XML tools such as XPath, XSLT, XQuery, and Schematron.

At the XBRL instance level there is very little hierarchy.  You have a flat set of facts, with dimensional information split out to xbrli:context elements, which sit at the same level as the facts.
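To make the flat layout concrete, here is a minimal Python sketch. The instance document is heavily simplified and illustrative: the `ex` namespace, the entity identifier, and the `decimals` value are assumptions, and `resolve_facts` is a hypothetical helper rather than part of any XBRL library. The point is that a fact's date and currency are not found by walking down from the fact; they must be joined in through the `contextRef` and `unitRef` attributes.

```python
import xml.etree.ElementTree as ET

# Illustrative, simplified instance document (not a real filing).
INSTANCE = """\
<xbrli:xbrl xmlns:xbrli="http://www.xbrl.org/2003/instance"
            xmlns:iso4217="http://www.xbrl.org/2003/iso4217"
            xmlns:ex="http://example.com/taxonomy">
  <xbrli:context id="c1">
    <xbrli:entity>
      <xbrli:identifier scheme="http://example.com/entities">EXAMPLE</xbrli:identifier>
    </xbrli:entity>
    <xbrli:period>
      <xbrli:instant>2009-09-26</xbrli:instant>
    </xbrli:period>
  </xbrli:context>
  <xbrli:unit id="usd">
    <xbrli:measure>iso4217:USD</xbrli:measure>
  </xbrli:unit>
  <ex:AssetsCurrent contextRef="c1" unitRef="usd" decimals="-6">31555000000</ex:AssetsCurrent>
</xbrli:xbrl>
"""

NS = {"xbrli": "http://www.xbrl.org/2003/instance"}


def resolve_facts(xml_text):
    """Join each top-level fact to its sibling context and unit (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Contexts and units sit beside the facts, so index them by id first.
    contexts = {c.get("id"): c for c in root.findall("xbrli:context", NS)}
    units = {
        u.get("id"): u.findtext("xbrli:measure", namespaces=NS)
        for u in root.findall("xbrli:unit", NS)
    }
    facts = []
    for child in root:
        ref = child.get("contextRef")
        if ref is None:
            continue  # not a fact: a context, unit, or other housekeeping element
        context = contexts[ref]
        facts.append({
            "concept": child.tag.split("}")[1],  # strip the "{namespace}" prefix
            "value": child.text.strip(),
            "instant": context.findtext("xbrli:period/xbrli:instant", namespaces=NS),
            "unit": units.get(child.get("unitRef")),
        })
    return facts


for fact in resolve_facts(INSTANCE):
    print(fact)
```

Even this toy version needs two lookup tables before a single fact can be interpreted, and it ignores duration periods, dimensions, and nested tuples entirely.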

At the XBRL taxonomy level, hierarchies are represented not as XML trees, but using XLink linkbases, which are complicated and expensive to process.
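As a rough illustration, the sketch below rebuilds a parent-to-children mapping from a tiny calculation linkbase. The linkbase content (the `ex.xsd` file and the concept ids) is invented for the example, and `calculation_children` is a hypothetical helper. It assumes one locator per label and a single file, and it ignores arc prohibition and overriding, all of which a real processor has to handle.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Illustrative calculation linkbase; file name and concept ids are made up.
LINKBASE = """\
<link:linkbase xmlns:link="http://www.xbrl.org/2003/linkbase"
               xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:calculationLink xlink:type="extended"
                        xlink:role="http://www.xbrl.org/2003/role/link">
    <link:loc xlink:type="locator" xlink:href="ex.xsd#ex_AssetsCurrent" xlink:label="total"/>
    <link:loc xlink:type="locator" xlink:href="ex.xsd#ex_Cash" xlink:label="cash"/>
    <link:loc xlink:type="locator" xlink:href="ex.xsd#ex_Inventory" xlink:label="inventory"/>
    <link:calculationArc xlink:type="arc"
        xlink:arcrole="http://www.xbrl.org/2003/arcrole/summation-item"
        xlink:from="total" xlink:to="inventory" order="2" weight="1"/>
    <link:calculationArc xlink:type="arc"
        xlink:arcrole="http://www.xbrl.org/2003/arcrole/summation-item"
        xlink:from="total" xlink:to="cash" order="1" weight="1"/>
  </link:calculationLink>
</link:linkbase>
"""

LINK = "{http://www.xbrl.org/2003/linkbase}"
XLINK = "{http://www.w3.org/1999/xlink}"


def calculation_children(xml_text):
    """Map each parent concept id to its ordered (child id, weight) pairs."""
    root = ET.fromstring(xml_text)
    tree = defaultdict(list)
    for ext_link in root.iter(LINK + "calculationLink"):
        # Labels are only meaningful inside their own extended link.
        concept_by_label = {
            loc.get(XLINK + "label"): loc.get(XLINK + "href").split("#")[1]
            for loc in ext_link.findall(LINK + "loc")
        }
        # The hierarchy is implied by from/to pairs, not by element nesting,
        # and sibling order comes from the "order" attribute.
        arcs = sorted(
            ext_link.findall(LINK + "calculationArc"),
            key=lambda arc: float(arc.get("order", "1")),
        )
        for arc in arcs:
            parent = concept_by_label[arc.get(XLINK + "from")]
            child = concept_by_label[arc.get(XLINK + "to")]
            tree[parent].append((child, float(arc.get("weight"))))
    return dict(tree)


print(calculation_children(LINKBASE))
```

Note that nothing in the XML nesting says that cash and inventory roll up into current assets; the tree exists only once the arcs have been resolved through the locators.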

To assemble a complete and accurate picture of an XBRL report, specialist processing code is essential.  Libraries such as our True North XBRL Processor can efficiently compile XBRL into a set of objects, accessible through Java or .NET.  Unfortunately, this isn’t much consolation to an XSLT-savvy analyst, or the architect responsible for an XML database. XBRL is XML, but for common analysis and transformation tasks it may as well be binary.

There are two ways to address this problem:

The first is to define a set of XPath extension functions, backed by an XBRL Processor.  The main obstacle here is that the extension functions must be integrated with every XML application in your processing chain.  A secondary, though potentially critical, consideration is that the cost of XBRL processing will be paid repeatedly, in each system that has been XBRL-enabled.  Finally, there is the impact on the XPath expressions themselves.  The XBRL tree structures are exposed, but not as an axis, so you lose the elegant path-based navigation that makes XPath so compelling.

The second, fundamentally different approach is to transform the XBRL into a format that’s easier to manage.

Our Composite XBRL representation brings all of the XBRL data and metadata together in a single document as traditional, hierarchical XML.  XPath once more comes into its own. Operations that were practically impossible to express against the source files, and cumbersome to express with extension functions, become trivial and natural.  The cost of XBRL processing is paid once, up front, and downstream processing can be handled by vanilla XML tools, without the need for extension libraries.  The impact of XBRL on your processing chain is minimised and localised.
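To give a flavour of the difference, the sketch below uses a hypothetical hierarchical layout, invented purely for illustration; it is not the actual Composite XBRL format, and the element names, attributes, and figures are all assumptions. With contributing facts nested under their total, a calculation check reduces to a couple of ordinary path expressions, here using the XPath subset in Python's standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical composite layout, invented for this example only.
# It is NOT the real Composite XBRL format; the values are made up.
COMPOSITE = """\
<report>
  <fact concept="AssetsCurrent" instant="2009-09-26" unit="USD" value="300">
    <fact concept="Cash" instant="2009-09-26" unit="USD" value="100" weight="1"/>
    <fact concept="Inventory" instant="2009-09-26" unit="USD" value="200" weight="1"/>
  </fact>
</report>
"""


def check_total(composite_xml, concept):
    """Return (contributor names, weighted sum of contributors, reported total)."""
    root = ET.fromstring(composite_xml)
    # One path expression finds the total ...
    total = root.find(f".//fact[@concept='{concept}']")
    # ... and its contributors are simply its child elements.
    parts = total.findall("./fact")
    names = [part.get("concept") for part in parts]
    computed = sum(float(p.get("weight")) * float(p.get("value")) for p in parts)
    return names, computed, float(total.get("value"))


names, computed, reported = check_total(COMPOSITE, "AssetsCurrent")
print(names, computed, reported)
```

No lookup tables, locators, or arcs are needed here, because the relationships were resolved once when the composite document was produced.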

We believe this approach has huge potential, and to support it we have produced TNT: the True North Transformer.  Contact us for further information.

