Announced: Seahorse® is the T4U Successor

After the recent EIOPA announcement that the XBRL reporting tool T4U will be decommissioned next month, many filers are now looking for a quick solution to keep their submissions compliant.

At CoreFiling, it’s our business to keep you compliant – that’s why we are proud to announce that we are offering a free trial of our cloud-based regulatory filing platform, Seahorse®, the successor to T4U.

This free trial gives you the opportunity to create one complete filing to submit to a regulator – and even better, users will have three months to explore the software before submitting their filing. Here are just some of the ways in which Seahorse® can help your organisation:

The Benefits

  • Seahorse® lets you create fast, error-free XBRL filings. Unlike T4U, its data rendering is XBRL-based, so the reports you send will never have data conversion errors or approximations. The data is 100% accurate every time.
  • Seahorse® is hosted in the cloud. Its architecture lets you update taxonomies instantly, with no tedious installations. You can create and view your filings anywhere, any time.
  • Seahorse® allows you to easily create XBRL filings in the familiar environment of Microsoft Excel.

How do I sign up?

Trial access is available to anyone. To claim your trial, simply visit our website and fill out the sign-up form.

Taxonomy Packages and the Table Linkbase

I’m pleased to report on some important steps forward regarding a couple of specifications that are close to our hearts.

On 27th March, EIOPA published the latest draft of the Solvency II taxonomy, making use of both the January Public Working Draft of the Table Linkbase Specification, and the Taxonomy Package Specification.

Moving to a recent PWD of the Table Linkbase specification is an important step for the development of both the specification and the taxonomy: the taxonomy draft can benefit from improved tool support, and the specification from real-world feedback.

Meanwhile, my colleague Jon Siddle continues to work tirelessly with the XII Rendering Working Group to complete the remaining work on the Table Linkbase Specification. The latest edition of IBR magazine included an article by Jon explaining how the specification expands the boundaries of what can be achieved with XBRL (see p27 of the March edition).

It’s also been an important few days in the world of XBRL for Corporate Actions. The final version of the 2012 Corporate Actions taxonomy was published on Monday as a Taxonomy Package. Just a few days earlier, it was announced that Citi have started using the Corporate Actions taxonomy for filing dividend announcements to the Depository Trust & Clearing Corporation (DTCC).

New Public Working Draft of Table Linkbase Specification

XBRL International has announced the publication of a new Public Working Draft of the Table Linkbase specification.  This specification forms a key component of the Solvency II and CRD IV XBRL reporting projects.  This release is the first Public Working Draft since 2011, and represents a significant step forward in the maturity and quality of the specification.

Projects looking to adopt the Table Linkbase specification have been held back by the lack of a recent public release, which has created interoperability problems as adopters have used customised versions of the published schemas and standards.

The latest release of the specification has been driven forward by the efforts of CoreFiling staff, and in particular, Jon Siddle.  CoreFiling contributions have included the introduction of an XML serialised “infoset” for defining and testing the conformance of Table Linkbase processors, and the refactoring of the specification into three separate models (Definition, Structural and Rendering) to give a clear separation between syntax and semantics.

These improvements to the foundation of the specification will accelerate the development of the standard towards becoming an XBRL International Recommendation, and will help address the interoperability issues that have beset early adopters of the specification.

Financial reporting has sign conventions, XBRL has a rule

In my previous post, I looked at how a lack of clear best practice around the naming of concepts and elements has contributed to the confusion around sign conventions in XBRL.  I believe that another contributing factor is that the sign conventions used in financial statements are not trivial, and actually quite subtle.

Let’s take another look at the example from my first post:

Turnover                     518     498
Cost of sales               (321)   (299)
Gross profit                 197     199
Administrative expenses     (211)   (105)
Operating profit/(loss)      (14)     94

You might also encounter a different presentation of exactly the same data:

Turnover                     518     498
Cost of sales                321     299
Gross profit                 197     199
Administrative expenses      211     105
Operating profit/(loss)      (14)     94

Different jurisdictions seem to converge on one approach or the other, but the point is that either approach is valid.  The same is not true in XBRL.  When it comes to signs in XBRL, there’s a right way to do it, and a wrong way to do it.

In the above examples, we changed the sign of certain numbers on the statement, but we did not change the meaning.  If you change the sign of a number reported in XBRL you will always change the meaning.

When humans read financial statements, they use domain knowledge and context to correctly understand the figures.  I know that companies do not usually report a negative cost of sales (domain knowledge), and a quick check of the figures above and below (context) confirms that in neither case are the suppliers paying the company!

XBRL facts are designed to be understood independently, without the need for context or domain knowledge.

To illustrate the issue, imagine the accounts above had an additional line item:

Taxation 100 (50)

In one year the company paid tax, and in the other it received a tax credit, but which was which?  In the context of the first table, I’d expect this to represent a tax credit of 100 and a tax charge of 50, but in the context of the second table, I’d assume the opposite meaning.  Without the context, it’s completely ambiguous.

By contrast, the sign to be used in XBRL is completely prescribed.  Ask the question, “What was the taxation?” If you answer “100”, then tag a positive number.  If you answer “actually, there was a tax credit of 100” then tag a negative number.

In the last two posts we’ve seen that tagging a value with the correct sign in XBRL is easy, provided that:

  1. You can correctly understand the figure that you are looking at, and you may have to use context and domain knowledge to do this (even if you don’t realise that that is what you are doing); and
  2. The meaning of the concept (which includes its sign convention) is accurately captured in its name.

If you’ve been following XBRL for a while, you might be surprised that I’ve got this far with no mention of balance attributes.  We can’t avoid them forever, so in my next post I’ll be looking at whether they have anything to add, or if they merely contribute to the confusion.

Getting the sign right: Names, labels, and extensions

In my previous article, I demonstrated a simple technique for getting the correct sign when tagging a number in XBRL.  You may have noticed that I was somewhat casual with the notion of concepts having a “name”.  If you’re familiar with the details of XBRL, you’ll know that concepts have an “element name” and typically have at least one label.  Which of these was I referring to?

It is common practice in XBRL to use the standard label to give a concept a human readable name.  The purpose of a name is to unambiguously identify the meaning of a concept, and part of that meaning is the sign convention.  Making a profit and making a loss are two very different things, and if the name of the concept doesn’t make it clear which of these things the concept represents, then it’s not a very good name.

Examples of good names would include:

  • Profit
  • Profit/(Loss)
  • Increase in Accounts Receivable
  • Increase/(Decrease) in Accounts Receivable

Examples of bad names would include:

  • Profit or Loss
  • Change in Accounts Receivable

(The last one is borderline – you might reasonably assume that a positive “change” is an increase, but it’s not explicit, and it’s not the sign convention that you’d expect to see used when displaying the concept on a Cash Flow statement.)

A more unconventional name like “(Increase)/Decrease in Accounts Receivable” would also be acceptable but note that this is a different concept to one called “Increase/(Decrease) in Accounts Receivable”.

If the idea that a concept should have a name, and that the name should make it clear what the concept means, sounds a bit obvious, then good – it is obvious!

The problem with element names

A concept also needs to have an element name.  This serves a different purpose, which is to provide a unique identifier for the concept in an XML document.  Human readability is not the primary concern, although most implementations have chosen to use meaningful names (e.g. ProfitBeforeTax), rather than arbitrarily generated identifiers (e.g. “c1234”).

XML imposes some constraints on what constitutes a legal element name, most importantly disallowing spaces and most punctuation.  This means that we can’t simply use the standard label as an element name.  Most implementations have adopted an approach of taking the standard label, stripping out punctuation and removing some connective words such as “and”.  This approach is encouraged by FRTA, although an exact rule is not spelt out.

The approach has the unfortunate side effect of turning clear concept names (i.e. standard labels) into rather more ambiguous element names.  For example:

Concept Name                                  Element Name
Profit/(Loss)                                 ProfitLoss
Increase/(Decrease) in Accounts Receivable    IncreaseDecreaseInAccountsReceivable

Such names undermine the notion that XBRL concepts have a clear and unalterable meaning, and that that meaning includes the sign convention.  I suspect that elements such as the above have caused at least some of the confusion about how signs work in XBRL.

There is a very simple approach that would remove this confusion, though it’s not one that has made it into any published best practice that I am aware of: drop the portions of the label that indicate the negated meaning when forming an element name. For example:

Concept Name                                  Element Name
Profit/(Loss)                                 Profit
Increase/(Decrease) in Accounts Receivable    IncreaseInAccountsReceivable

If you’re uneasy about this approach, remember that the element is just a unique identifier.  It is not intended to be a descriptive label, so the fact that it does not spell out the meaning of a negative value is unimportant.
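For illustration, both derivations can be sketched in a few lines. FRTA does not spell out an exact algorithm, so the details here (the regular expression in particular) are assumptions:

```python
import re

def element_name(standard_label, drop_negated=True):
    """Derive an element name from a standard label (a sketch).

    With drop_negated=True, parenthetical negated meanings such as
    "/(Loss)" or "(Increase)" are removed before forming the name, as
    proposed above; with drop_negated=False, it mimics the common
    FRTA-style approach of keeping them.
    """
    if drop_negated:
        # Remove parenthesised portions (and a preceding slash, if any).
        standard_label = re.sub(r"/?\([^)]*\)", "", standard_label)
    # Strip punctuation and CamelCase the remaining words.
    words = re.findall(r"[A-Za-z0-9]+", standard_label)
    return "".join(w[:1].upper() + w[1:] for w in words)

element_name("Profit/(Loss)")                               # "Profit"
element_name("Increase/(Decrease) in Accounts Receivable")  # "IncreaseInAccountsReceivable"
element_name("Profit/(Loss)", drop_negated=False)           # "ProfitLoss"
element_name("(Increase)/Decrease in Accounts Receivable")  # "DecreaseInAccountsReceivable"
```

Note that the last call shows the opposite sign convention producing a different (and correctly distinct) element name.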

Extensions and the SEC

In my view, the confusion around signs in XBRL has been fuelled by a number of details of the implementation of XBRL at the SEC.  In the SEC implementation, preparers submit not only an instance document, but also an extension taxonomy allowing preparers to customise the taxonomy to better match their financial statements.

The SEC rule (33-9002) that enabled the use of XBRL for SEC filings requires filers to change the labels of standard concepts in the US GAAP taxonomy to match those on the company’s financial statements. You can argue about whether that’s a good idea or not, but doing so opens the door to confusion around sign conventions.

The text of the rule gives the example of a company relabeling “Gross Profit” as “Gross Margin”, as the two are “definitionally the same”. That seems harmless enough, but what if the line item in your financial statements is “(Increase)/Decrease in Accounts Receivable”? Should you change the standard label of the US-GAAP concept from “Increase/(Decrease) in Accounts Receivable” to “(Increase)/Decrease in Accounts Receivable”? In my view doing so is absolutely unacceptable: an increase in accounts receivable is not the same as a decrease in accounts receivable, so changing the name of a concept in this way is very misleading.

The SEC system does provide an appropriate way to handle this situation (negating labels), but the guidance in the Edgar Filer Manual could be clearer. Rule 6.11.1 instructs filers to “Assign a label of an element used in an instance the same text as the corresponding line item in the original HTML/ASCII document”, but nowhere in this rule does it suggest that assigning a standard label that implies the opposite sign convention is unacceptable. Rule 6.11.6 explains how to use negating labels, but does not explain what you should do with the standard label.

Proposed new best practice

I believe that much of the confusion around XBRL sign conventions could be removed by clearly documenting two pieces of best practice:

  1. The element name must reflect the meaning of a positive value reported against the concept.  If the element name is being formed from the standard label, parenthetical indications of negative meanings should be removed.  In other words, a concept called “Profit/(Loss)” should result in an element name of “Profit” not “ProfitLoss”.
  2. When using an extension taxonomy to re-label concepts, it is never acceptable for a standard label to change the meaning of a concept, and meaning includes sign.  For example, “Increase/(Decrease) in Accounts Receivable” must not be re-labelled as “(Increase)/Decrease in Accounts Receivable”.  The correct place for such a label is as a negated label.


Sign conventions in XBRL: it’s just not that difficult!

One of the things that has continued to surprise me about the adoption of XBRL is the amount of discussion that the question of tagging figures with the correct sign can generate. Brendan Mullan recently managed to start no fewer than 12 separate threads on this topic on the xbrl-public list, some of which resulted in significant further discussion.

Marking up figures in electronic format is not a new phenomenon, and I’m not aware of any other domains that have managed to get so tangled up in sign issues.  What is it about the application of XBRL to financial reports that causes such difficulty?  I have some ideas, but first let’s look at how to do it right.

All you need to know to get the sign right

Let’s consider the following extract from a Profit and Loss statement:

Turnover                     518     498
Cost of sales               (321)   (299)
Gross profit                 197     199
Administrative expenses     (211)   (105)
Operating profit/(loss)      (14)     94

There’s a really simple way to get the sign right in XBRL, every time.  Simply take the name of the concept that you’re using to tag the figure, and turn it into a question by prefixing it with “What was the… ”

For example, suppose our concept is called “Cost of Sales”.

Question: What was the Cost Of Sales?

Answer: £321,000

Even though the figure in the accounts is shown as “(321)”, you wouldn’t answer that question by saying “minus £321,000”, would you?  So we tag a positive number.

On the other hand, if in your answer you need to correct the question, then the sign should be negative:

Question: “What was the Operating Profit?”

Answer: “Actually, there was a loss of £14,000.”

Our answer is the opposite of the question that was asked, so we’d tag a negative number against a concept called “Operating Profit”.

Sometimes the concept name will be more explicit about the sign convention.  For example, you might have a concept called “Increase (Decrease) in Accounts Receivable”.  In this case, just ignore the bit in brackets, so your question becomes:

Question: “What was the Increase in Accounts Receivable?”

If your answer starts, “actually there was a decrease…” then you should tag a negative number.  Otherwise, you should tag a positive number.

It really is that simple.  Nothing to do with balance attributes, negated labels, calculation weights or any of that stuff.
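The question test is simple enough to state as code. This sketch just captures the two cases; deciding whether your answer contradicts the question is, of course, the human part:

```python
def xbrl_value(printed_magnitude, answer_contradicts_question):
    """Apply the 'question test': tag a negative number only when the
    honest answer has to correct the question (a sketch -- the
    judgement itself can't be automated)."""
    if answer_contradicts_question:
        return -printed_magnitude
    return printed_magnitude

# "What was the Cost of Sales?" -> "£321,000", so tag a positive
# number, even though the accounts print it as (321).
cost_of_sales = xbrl_value(321, answer_contradicts_question=False)    # 321

# "What was the Operating Profit?" -> "actually, there was a loss of
# £14,000", so tag a negative number.
operating_profit = xbrl_value(14, answer_contradicts_question=True)   # -14
```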

How did we end up in such a muddle?

There are a number of reasons why what should have been a really straightforward issue has become tangled into something much more complicated. I’ll address these in a series of follow-up articles.



XBRL and very large instances

There has been a lot of discussion recently in the XBRL community about the use of XBRL for very large datasets. There are a number of misconceptions around about the practicalities of working with large instances, and some confusion about the extent to which different approaches to processing XBRL can improve performance. This article attempts to shine some light on the problem, and propose ways in which performance could be improved when working with large datasets.

XML: Productivity through inefficiency

When XML was gaining popularity at the turn of the century, there were many people who complained that it was an inherently inefficient way to work with data.  For anyone with experience of packing data into binary structures to minimise storage and memory usage, the idea of using XML tags around text representations of data seemed extremely wasteful.

The reality is that whilst XML is inherently inefficient relative to packed binary formats, or even CSV, computing power and memory usage had reached the point where, for most everyday data sets, the performance implications of this inefficiency were negligible, and were outweighed by the benefits of working with self-describing data that could be processed using standard validators and tools.

As computers have continued to evolve, the cut-off for how much data it is reasonable to handle using XML has increased.  For example, the core of many XML applications is the Document Object Model (DOM).  Memory requirements for DOM are of the order of ten times the size of the XML document.  In a world where computers with several gigabytes of RAM are commonplace, processing XML documents that are tens of megabytes in size has become feasible, but documents that are more than a few hundred megabytes in size remain problematic.

For such datasets, there are essentially two options:

  • Find more efficient ways of working with the XML data; or
  • Don’t use XML.

XBRL, being built on XML, suffers from the same inefficiency of representation, and the same challenges in processing.  In fact, in many cases, the problems are more acute as XBRL is not particularly efficient in its use of XML. This is particularly noticeable for heavily dimensional data, where each <context> element is only used by a small number of facts.

DOM is really inefficient, right?

As noted above, many processors are built around the Document Object Model (DOM), or some other DOM-like interface such as libxml or Python’s lxml [1]. The key feature of such interfaces is that the XML document is parsed into an in-memory representation allowing random access to all of the information that was in the XML. A “universal truth” often cited by people who know just enough to be dangerous is that “DOM is really inefficient”. Whilst it is true that the memory overhead of the DOM is significant, whether it is an efficient way to solve a particular problem depends on the nature of the problem and what the alternative approaches are.

The standard alternative to a DOM-like approach is a stream-based approach such as SAX. SAX presents an XML document as a series of events, and it is up to the consuming application to extract the useful items of data as the events are received, and typically, store the extracted information in some in-memory representation.

The key to a stream-based approach being more efficient than a DOM-based approach is how much information you store in memory as a result of the parse, and the key to that is whether you can know in advance what subset of information you want from the XML document.

When working with an XBRL document, you generally don’t need all the information that’s in the XML model. What you want is an XBRL model. You don’t want to work in terms of elements and attributes; you want to work in terms of facts, concepts, labels, dimensions, etc. In an ideal world you could SAX-parse your XBRL document straight into an XBRL model, and there would be no need for a DOM-style, in-memory representation of the XML.
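To illustrate the idea, here is a minimal sketch of streaming facts out of an instance with Python’s xml.sax. It treats any element carrying a contextRef attribute as a fact and keeps nothing else; a real processor would also have to resolve contexts, units, and the supporting taxonomy, and the namespace and concept names below are invented for the example:

```python
import xml.sax

class FactHandler(xml.sax.ContentHandler):
    """Stream an XBRL instance into a bare-bones fact list, discarding
    the rest of the XML rather than building a DOM."""
    def __init__(self):
        self.facts = []       # (concept, contextRef, value) tuples
        self._open = None     # fact element currently being read
        self._chars = []

    def startElement(self, name, attrs):
        if "contextRef" in attrs:       # XBRL items carry a contextRef
            self._open = (name, attrs["contextRef"])
            self._chars = []

    def characters(self, content):
        if self._open:
            self._chars.append(content)

    def endElement(self, name):
        if self._open and name == self._open[0]:
            value = "".join(self._chars).strip()
            self.facts.append((name, self._open[1], value))
            self._open = None

# A tiny hand-written instance (namespace and concept are invented):
instance = b"""<xbrl xmlns="http://www.xbrl.org/2003/instance">
  <eg:Turnover xmlns:eg="http://example.com/taxonomy"
      contextRef="y2012" unitRef="GBP" decimals="-3">518000</eg:Turnover>
</xbrl>"""

handler = FactHandler()
xml.sax.parseString(instance, handler)
# handler.facts is now [("eg:Turnover", "y2012", "518000")]
```

Memory use here is proportional to the number of facts kept, not to the size of the XML document.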

Divorcing XBRL from XML

Unfortunately, we don’t live in an ideal world, and there are a few ways in which XBRL clings unhelpfully to its underlying XML representation. The heart of the problem is that there is no well-defined information model for XBRL. There have been various efforts to create one, such as the XBRL Infoset, and more recently the Abstract Model, but none have yet come to fruition. The result of this is that there is no common agreement about which parts of an XML document are “significant” from an XBRL perspective, and which parts are irrelevant syntax-level detail.

A good example of where this creates a practical problem is XBRL Formula’s use of XPath as an expression language. Whilst the primary way of selecting facts for use in a formula is to assign them to variables using the fact selection constructs provided by the specification, XBRL Formula allows formula writers to include arbitrary XPath expressions. In other words, they can work not just with the XBRL, but with the underlying XML. Whilst this makes XBRL Formula very powerful, it means that an XBRL Formula processor is obliged to keep a copy of the XML document in memory in order to support the XML-based navigation required by XPath. In other words, if you want to use XBRL Formula, you’re pretty much stuck with the DOM, or something very much like it.

Another example is in the specification of validation rules. Here at CoreFiling, we’ve got a really nice XBRL model in the form of our True North API, and it makes writing XBRL validation rules really quick and easy. Unfortunately, validation requirements are often specified in terms of XML syntax rather than an XBRL model (this isn’t altogether surprising, given the above-mentioned absence of a commonly agreed XBRL model). A prime example of this is the Edgar Filer Manual, which defines validation criteria for SEC submissions. A quick read of the manual reveals rules specified in terms of XML artefacts such as elements and attributes, and not just XBRL artefacts like facts and concepts. The net result of this is that in order to implement many of these rules accurately, we need to dive behind our nice XBRL model and delve into a lower-level DOM-like model of the XML.

A Pure XBRL Processor

To summarise, in order to work with XBRL more efficiently and allow scaling to much larger instance documents, we need to work with it as XBRL, not XML. We need to introduce the notion of a “Pure XBRL Processor” which is free to discard irrelevant XML syntax.

In order to do this, we first need to define a commonly agreed XBRL model. We can then be clear about which problems can be solved with an efficient Pure XBRL Processor, and which are dependent on a processor with access to the underlying XML.

We then need to revisit technologies such as XBRL Formula and figure out how we can make them work with a Pure XBRL Processor. One option, of course, is to switch to an entirely different technology such as Sphinx which is already built on top of an XBRL model.

Another option is to restrict the XPath expressions that are allowed in XBRL Formula to a subset that can be implemented on top of a Pure XBRL Processor. In other words, retain the ability to access functions and variables, but remove the ability to do arbitrary navigation of the underlying XML document. This would be no bad thing. I spoke recently to XBRL Formula guru Herm Fischer, and he expressed his concern at the number of Formula rules he’d seen that use XPath expressions to navigate the XML model, rather than treating it as XBRL.

I’ve written previously about the risks of trying to treat XBRL as XML. Restricting XBRL Formula so that it can only work with the XBRL Model should lead to better written, more robust XBRL Formulas, and hopefully will guide rule writers away from concerning themselves with irrelevant syntactic details.

Of course, whilst a pure XBRL approach has the potential to use far less memory than one which must retain an XML model, ultimately any in-memory approach is going to have memory requirements that are proportional to document size, and so will always have an upper limit on the size of document that can reasonably be processed on any given hardware. For extremely large instance documents, more radically different approaches to processing will be necessary. Such approaches may well rule out the possibility of using familiar technologies such as Sphinx and Formula altogether. For such documents, moving to a pure XBRL approach is a necessary first step, but it’s not the whole solution.

I’m sure that these suggestions won’t appeal to everyone, but as XBRL moves into the enterprise, we need to free the information from the syntax used to represent it.

[1] From this point on, I use the terms “DOM” and “DOM-like” to refer to any approach that stores an in-memory representation of the full XML model. Whilst it’s certainly possible to create DOM-like implementations that are more efficient than an actual DOM implementation, memory usage is still likely to be some multiple of the original document size and so will still suffer from the same fundamental performance limitations.

Reviewing financials with Magnify and Sphinx

Charlie Hoffman has added an interesting post to his blog about using Magnify to verify the integrity of a financial report.

Our Magnify XBRL review tool comes with a range of generally applicable XBRL quality checks built in, as well as some jurisdiction-specific filing rules, such as the Edgar Filer Manual and HMRC’s Joint Filing Common Validation Criteria rules, but as Charlie demonstrates, the real power of Magnify comes from the ability to drop in custom rules.

Magnify’s checklist view allows users to build a custom, structured review based on checks that can be implemented in a range of technologies. The fastest way to build rules that operate on the XBRL semantics of a report is Sphinx. We do also support the XBRL International Formula standard, but as Charlie notes, “creating Sphinx rules is much, much easier”.

Charlie’s published the source of the rules that he’s using. Although readable, they look a little bland in this plain-text format. Sphinx rules are most easily developed using SpiderMonkey, which provides a rules development environment with syntax highlighting, concept drag-and-drop, and on-the-fly syntax validation.

There are a few neat features to note in the rulebase. The first one is these few lines:

namespace ""
       to ""

namespace ""
       to ""

These two “transform” statements make all of the rules in the rulebase, which are written against the 2011 US GAAP taxonomy, also work with the 2009 US GAAP taxonomy. Once it’s published, two more lines will extend them to work with the 2012 taxonomy. Obviously this depends on the relevant concepts existing in both versions of the taxonomy, but where they don’t, you can add some additional, more granular, transform statements to provide the necessary mappings. What’s more, if you happen to have an XBRL Versioning Report, you can easily generate the necessary transform statements.

Another thing to note about the rules is that they contain everything needed to generate the checklist that Charlie includes in the screenshot. Our validation platform is about more than just defining and executing validation rules. It’s about building a powerful and intuitive review environment:

[Magnify screenshot]

Website outage – the need for Taxonomy Packages

For several hours this morning (UK time) the website was unavailable. You might think that this was of little consequence, until you realise that, consistent with XBRL best practice, HMRC’s guidance for company accounts requires that UK GAAP filings reference the UK GAAP taxonomy at its canonical location using a <schemaRef> element. The XBRL 2.1 specification requires that XBRL processors resolve and discover the taxonomy documents referenced by such <schemaRef> elements. As such, out-of-the-box XBRL software following the rules of the specification couldn’t process UK GAAP instance documents during the outage this morning, and for anyone trying to use such software to create or review the accounts for their Corporation Tax return, this was a problem.

A similar issue existed for other UK taxonomies, such as UK-IFRS, and indeed, any of the many other taxonomies hosted on the website.

As noted in my earlier post, most XBRL software already has some mechanism for configuring local copies of taxonomies so that processing is not dependent on your internet connection or third party websites. Unfortunately, configuring such offline copies isn’t particularly easy. This is where taxonomy packages can help, as they contain all the information necessary to set up an offline copy of a particular taxonomy.
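The mechanism a Taxonomy Package provides for this is essentially a catalog that rewrites canonical URL prefixes to local copies. The effect can be sketched in a few lines; the taxonomy URL and local path below are hypothetical:

```python
# Hypothetical remappings; in practice these would come from a
# package's catalog file rather than being hard-coded.
REMAPPINGS = {
    "http://www.example.org/uk-gaap/2009-09-01/": "/opt/taxonomies/uk-gaap/",
}

def resolve(url):
    """Rewrite a canonical taxonomy URL to a local copy where one is
    configured, so taxonomy discovery need not touch the network."""
    for prefix, local in REMAPPINGS.items():
        if url.startswith(prefix):
            return local + url[len(prefix):]
    return url  # no mapping configured: fall back to the canonical URL

resolve("http://www.example.org/uk-gaap/2009-09-01/uk-gaap-full.xsd")
# -> "/opt/taxonomies/uk-gaap/uk-gaap-full.xsd"
```

With the mapping in place, an outage of the hosting website no longer blocks processing; unmapped URLs still resolve to their canonical locations.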

As XBRL becomes an important part of everyday business, ensuring that XBRL processes are implemented in a robust manner becomes essential. Taxonomy Packages can make doing that just a little bit easier.