There is much debate within this venue about the various merits and hazards of the TPP.
This diary is not about that. Instead, this is about the “meta” of the debate around the treaty. That is, I am not approaching this from the standpoint of what the treaty does, since an official copy, whether draft or otherwise, is not available. Instead, I am approaching this from the standpoint of the resources and processes needed to evaluate a large, complex, technical document that can affect far more people than were involved in its negotiation and composition.
Note that while some portions of the TPP are available, and are used here for gathering “meta” information for complexity calculations, the actual impact of the TPP is not evaluated.
Note also that the approach to evaluating the complexity I will use here is to view the treaty similarly to a software project. This is an extensive computer science issue of itself. However, some of the basic concepts should be readily applicable to an ostensibly non-software document like a treaty.
Comparison: A Patent
One inspired moment, implemented as a clever widget or a few lines of software, leads to a one-page write-up that is presented to a patent lawyer. Some months (and several thousand dollars) later, the Provisional Application is completed. This might be 10 pages long, or 100 times the original length of the concept. The Provisional Application is then reviewed by the Patent Office for another year, after which they bring forth any objections or requested clarifications. If it isn’t rejected outright, it is updated and the final Patent Application is submitted, which then goes through another round of review.
This process takes about three years for a typical corporate patent. By the time of completion it has gone through several layers of review:
* The inventor
* Management of the inventor
* Corporate patent lawyer
* Prior Art search, commonly by a separate firm
* Review of the Prior Art
* Drafting and review of the Provisional Application, commonly by yet another separate firm
* Patent Office review of the Provisional Application
* Patent Office review of the Final Application
Advancing from one stage to the next is costly. There are resources to be hired, and opportunity costs for everyone else involved. The multiple layers of review allow for earlier detection of problems.
Note that this covers only internal review processes of people all on the same “team”, and then the interaction between them and the Patent Office. No multiple, disparate groups trying to get their say; no coordination of translations between languages. It has a singular, well-defined purpose and scope. A patent is a much simpler type of document.
Comparison: International Technical Standards
A technical standard might be 100 pages long, and that might be one in a group of 10 interacting standards. Technical Standards come about because over the course of time there are various best-practices that have been developed. These are based in combinations of scientific research, engineering practice, production methods, product safety, environmental protection, and many other areas. A particular product may be subject to a wide range of such standards. In some cases the methods used to satisfy requirements of one standard conflict with those methods used to satisfy requirements in another standard.
A product is evaluated to a standard by an agency. These agencies are licensed by the government to perform the evaluation, and are also sanctioned by the organizations that produce the standards. In the case of product testing, the device is sent to an approval agency and tested to the standards. The actual implementation of the testing and review process is not necessarily obvious from the letter of how the standard is written. This process can take on the order of months, and several thousand dollars, all of which is very dependent on the complexity of the product.
Many modern technical standards have a way of becoming “harmonized”, that is, national standards are updated to match with the equivalent standards from other countries. This facilitates importing, exporting, cross-functionality, etc. The update process involves the relevant agencies and groups in each country, which are then coordinated with the equivalent international agencies and groups.
Who is “The Public”?
Within the question of whether to release the draft to the public lies the question of just who makes up this “Public”. A casual definition is that it includes all those who live in the United States. However, another approach is to review the list of those involved with the negotiation and approval of the treaty.
* The U. S. House. Representatives cannot review the draft in concert with their staff personnel, who are much more likely to be able to spend time concentrating on the document.
* The U. S. Senate. Same situation as the U. S. House.
* Other trade lawyers who worked on past agreements, officials from various government departments, etc.
* Other corporate lawyers and directors from corporations not represented in the negotiations.
* Professors who make a study of this subject along with others who study this area.
These are precisely the people who should be reviewing the drafts now, before the treaty is published in final form. They are the ones who can most efficiently identify deficiencies and problems in the document. Politicians have constituencies to represent, and others have expertise in the area.
“Black Box” vs. “White Box” testing
We’ll get to this a little more later, but for now let’s keep in mind that “Black Box” testing means testing a system solely by adjusting its inputs and monitoring its outputs. One uses the user documentation of the “Black Box” to determine whether it works properly. One does not know the actual implementation within the “Black Box”.
In “White Box” testing, one does know how the system is implemented, and has full access to it. The same tests are run, but by knowing the actual implementation some tests can be more easily made to target specific features that may be undocumented.
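As a minimal sketch of the difference (the function and the rule are invented purely for illustration, not taken from the TPP), consider even a trivial piece of software:

```python
# Hypothetical example: a made-up eligibility rule, tested both ways.

def tariff_applies(value: float, origin_certified: bool) -> bool:
    """Returns True when a (made-up) tariff applies to a shipment."""
    if origin_certified:
        return False          # certified origin is exempt
    return value > 1000.0     # otherwise only high-value shipments pay

# Black-box testing: we only know the documented behavior,
# so we probe inputs and check outputs against the documentation.
assert tariff_applies(1500.0, False) is True
assert tariff_applies(500.0, False) is False

# White-box testing: reading the implementation reveals the
# certification short-circuit, so we add a test targeting that branch.
assert tariff_applies(1500.0, True) is False
```

The black-box tester can only confirm documented behavior; the white-box tester, seeing the certification short-circuit, knows to target that branch directly.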
Even a plain text system may effectively be a “Black Box” if it is complicated or obfuscated enough. This is where the importance of working papers and drafts comes in. Through the incremental changes to drafts, key change areas are easily identified. Areas with a lot of activity tend to be interesting. Working papers provide interpretation, or everyday language, of a topic before it is translated into stilted treaty language.
Each word and number is understandable on its own. However, the concept is obscured in the plain text of the specialty treaty language. A fair amount of study of the context would be necessary to understand such a passage. There will be several hundred such passages distributed throughout the entirety of the document – many of which will interact. By what process can this be understood? Reviewed? Debated?
How is complexity measured?
There are several ways to measure the complexity of a document. Here I will use the term “document” to refer to the entire treaty and “chapter” to refer to a single chapter. Each of the chapters is subdivided into Articles, which in turn have paragraphs and optional clauses. I will apply the concept of Cyclomatic Complexity to how the evaluation of the document could be approached.
In effect, it measures the number of paths possible through a particular section of code, or in this case, an Article of the treaty document. To fully cover the impact of a policy, software, a contract, a treaty, etc., all interactions between the various sections need to be evaluated. The key is that paths through the entire document are tested, not isolated unit testing of each individual clause.
This measures structure. It does not evaluate content. It provides a relative metric by which the resources necessary for evaluation can be estimated, as in, one Article is 5 times more complex than some other Article, so it will need more time or people to determine what is going on.
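For software, this kind of structural metric can be computed mechanically. Here is a minimal sketch in Python, using the standard `ast` module to approximate McCabe’s measure as one plus the number of decision points (the `review` function is a made-up example, not drawn from the treaty):

```python
import ast

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points.
    Counts if/elif, loops, conditional expressions, boolean operators,
    and exception handlers."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, (ast.If, ast.For, ast.While, ast.IfExp,
                             ast.ExceptHandler)):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds len - 1 decision points
            decisions += len(node.values) - 1
    return decisions + 1

sample = """
def review(article, has_exception, value):
    if has_exception:
        return "exempt"
    if value > 10 and article == "12.2":
        return "flagged"
    return "ok"
"""
print(cyclomatic_complexity(sample))  # 2 ifs + 1 'and' + 1 = 4
```

A treaty Article offers no such parser, which is part of the point: for the TPP, this counting has to be done by hand.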
Formulas
I adapted these formulas based on my interpretation of Cyclomatic Complexity and how it may best apply. If shortcuts were made, they were done in such a way to reduce the calculated complexity for the benefit of the TPP.
Here, P denotes the paragraphs of an Article, C the clauses of a paragraph, PX a paragraph’s complexity, and AX the Article’s complexity (“-” means none present):

If P = “-” (an Article with no paragraphs), then AX = 1.
If P = [1, 2, 3...], then PX = C + 1 for each paragraph, and AX = PX(1) * PX(2) * PX(3)...
If C = “-” (a paragraph with no clauses), then PX = 2, and AX = # of P * 2.
The complexity of the entire document is all the chapter complexities multiplied together. Likewise, the chapter complexity is all the Article Complexities multiplied together. This typically is not done with software, although it could be. If we only analyze individual Articles by themselves, not considering how they interact, the review process is of course made much simpler. While this is the approach I will take, it also opens up the possibility of missing serious issues with the overall document.
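Here is a minimal Python sketch of these rules as I read them (the function names are mine, and I interpret a clause-less paragraph as contributing a factor of 2 to the product):

```python
import math

def article_complexity(paragraphs):
    """Complexity AX of one Article.
    `paragraphs` is a list of clause counts, one entry per paragraph;
    use None for a paragraph with no clauses (C = "-"), or pass an
    empty list for an Article with no paragraphs at all (P = "-")."""
    if not paragraphs:          # P = "-": a bare Article
        return 1
    ax = 1
    for clauses in paragraphs:
        px = 2 if clauses is None else clauses + 1  # C = "-" gives PX = 2
        ax *= px
    return ax

def chapter_complexity(articles):
    """Chapter complexity: the product of its Article complexities."""
    return math.prod(article_complexity(a) for a in articles)

# A hypothetical Article with three paragraphs of 1, 2 and 3 clauses:
print(article_complexity([1, 2, 3]))  # (1+1)*(2+1)*(3+1) = 24
```

The document-level figure would then be `math.prod` over the chapter complexities in turn.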
This is where “White Box” testing comes in. This allows us to select Articles that may be particularly egregious, suspicious, or ludicrous. We can then work from that point backwards and forwards to see what conditions get us to that situation.
For an illustration and validation of these calculations, here is a sample segment of software modeled after Chapter 12 Article II.2, along with the graph of the complexity:
[Figure: Chapter 12 Article 2 model (source code)]
[Figure: Chapter 12 Article 2 flow graph]
This seemingly simple function has a complexity of 24; that is, there are 24 paths through it. We have not evaluated whether some paths are excluded by particular input values, but that is not necessary here, since generating that information would take roughly as much work again. Yet compared to the Article it models, the source code seems truly simple indeed!
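Since the model itself appears here only as an image, here is a generic stand-in (not the author’s actual code) showing how 24 paths can arise from just three independent decision points:

```python
from itertools import product

# Three independent decision points with 2, 3 and 4 possible outcomes:
# every combination of outcomes is a distinct path, so the path count
# is the product 2 * 3 * 4 = 24.
decision_outcomes = [2, 3, 4]
paths = list(product(*(range(n) for n in decision_outcomes)))
print(len(paths))  # 24
```

Each additional decision point multiplies, rather than adds to, the number of paths a reviewer must consider.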
Note that the usual guidance in the literature on cyclomatic complexity is to limit it to the range of 10 to 15. A function of 20 is already quite difficult, and 100 is fairly extreme for understandability unless there is some consistent, regular pattern.
What would an evaluation look like?
The first pass through a document allows one to check for spelling and not much else. This is how the reviewer gets up to speed with the terminology and structure of the document. There is very little time or ability to evaluate the applicability or impact of the document at this stage.
The second pass through a document may allow for some initial estimation of how some clauses behave, but this is still limited. The impact of one individual clause depends too much on the remainder of the situation and the other clauses that apply.
At some point a few specifics will have to be declared to test how the document can be used. These specifics could be chosen from current topical issues, past situations, etc. A number of these scenarios would have to be assembled. This alone is a complicated task, as the various areas of impact of the document would need to be determined, and permutations of those impacts gathered together in order to have fairly robust coverage across a wide range of cases.
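To sketch how quickly such scenario sets grow, here are some hypothetical axes (all the categories below are invented purely for illustration):

```python
from itertools import product

# Made-up scenario axes for exercising a treaty; real coverage would
# need far more axes and values than these.
industries = ["pharma", "agriculture", "software"]
disputes   = ["tariff", "IP infringement", "labor"]
parties    = ["state-vs-state", "investor-vs-state"]

scenarios = list(product(industries, disputes, parties))
print(len(scenarios))  # 3 * 3 * 2 = 18 scenarios from just three axes
```

Even three small axes yield 18 scenarios; each added axis multiplies the total.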
It’s likely that some scenarios would start with a questionable clause and attempt to work backwards to construct a situation to achieve that path.
With some scenarios defined, the actions allowed by the document could then be simulated.
For the fun of it, here is Chapter 12 Article II.28, with a complexity of 23,040:
[Figure: Chapter 12 Article 28 model (source code)]
[Figure: Chapter 12 Article 28 flow graph]
Now what?
Up until this point was the easy part of the evaluation. A fair amount of the (60 day?) evaluation time would have been used up. There isn’t enough time left over to properly run the simulations. The simulations stress and exercise the document to bring out procedural flaws (or features, as the case may be) in order to properly characterize the entirety of the effect of the document. The modern case of “Here be dragons” is “Here be loopholes”.
Let’s put it this way: one simulation run would be a nice activity for any of:
* A doctoral thesis
* A semester-long political science or economics group project in class
* Your local friendly think-tank
* A group of concerned citizens over an extended amount of time
The number of simulation runs necessary to have a probability of identifying problems is related to the Cyclomatic complexity. Now we can see where this is going! With a large document, the complexity gets so big that it becomes impossible to characterize it within a limited time with a limited resource budget. Without coordination between the groups that run the simulations there is no systematic way to make sure the coverage is enough to likely identify issues.
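A back-of-envelope sketch with assumed numbers (none of these figures come from the diary; they are placeholders to show the scale):

```python
# Assumed: each simulation run occupies one team for one week, and
# covering 10% of the paths counts as adequate. Both are invented.
complexity = 23040            # Chapter 12 Article 28, per the text
coverage_fraction = 0.10      # hypothetical "good enough" coverage
weeks_per_run = 1
teams = 20

runs_needed = complexity * coverage_fraction
weeks = runs_needed * weeks_per_run / teams
print(f"{runs_needed:.0f} runs -> about {weeks:.0f} weeks with {teams} teams")
```

Even with twenty parallel teams and generous assumptions, a single Article’s worth of coverage runs to years, not the 60-day window.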
Extra credit: more than one scenario may apply simultaneously!
Conclusion
A set of procedures is defined within a document. These procedures depend upon each other. They are implemented within some regime to operate on data that is necessarily not defined at the time the procedures were written.
To test the procedures before the document is implemented a number of test methods are applied. Some of these work on the static definitions within the document; others simulate situations with test data, upon which the document is stressed. Results are gathered and analyzed.
Whether this is a million lines of software or a hundred pages of international treaty the approach can be similar.
A few exercises quickly point out the absurdity of giving those who are entrusted to vote on Fast Track, and eventually on the TPP, such narrow access to the document, without even the ability to take notes: it is effectively impossible for them to ever reach the point of understanding even a single chapter. It also points out the absurdity of proceeding to Fast Track before an evaluation of the draft text is made: there is no chance to find a solution for identifiable flaws.
Is there any other work of such scale and potential impact where the responsibility of determining exactly what it is capable of, and what its flaws are, is so thoroughly abrogated?
Without the benefit of sustained investigation and analysis by political staff, the lone politician who wanders into the vault where the document is kept under lock and key, guarded and monitored by minders, is at a complete disadvantage compared to the legion of negotiators who can discuss, review, craft, and intricately architect the document to suit their needs.