Precise Proofs

Overview

One key use case for blockchains, and cryptography in general, is data integrity verification through the computation of hash values. Any modification of a hashed data item is easily detected, because the hash value changes when the data changes. In most cases, proving possession and integrity of data to a verifier system requires the data to be revealed in its entirety, which is a privacy concern. Precise proofs are an efficient cryptographic technique for proving knowledge of data while revealing only a small subset of it to the verifier system.
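As a minimal sketch of the hash property described above, the following snippet uses SHA-256 from Node.js's built-in `crypto` module. The sample strings and the `sha256` helper name are made up for illustration.

```javascript
const crypto = require("crypto");

// Hex-encoded SHA-256 digest of a UTF-8 string.
function sha256(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

const original = sha256("energy: 100 kWh");
const modified = sha256("energy: 101 kWh");

console.log(original === modified); // false: one changed character gives a different hash
console.log(original === sha256("energy: 100 kWh")); // true: same input, same hash
```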

Application

Precise proofs can be used in situations where a light and cheap way of proving knowledge of data is required, and where a small subset of the data can safely be released. An example use case is demonstrating knowledge and possession of correct data to a third party, such as an auditor or regulator.

Precise proofs can be an alternative to zero-knowledge proofs whenever a lightweight approach is favored and strict zero-knowledge is not a requirement. Furthermore, the approach is useful when all parties involved have had exposure to the data at some point. It is also possible to ensure that two datasets are in sync while privacy is preserved.

Applied to blockchains, one of the most useful applications of precise proofs is verifying that some off-chain data corresponds to an on-chain representation. They can thus help to build systems that limit the amount of data stored on-chain while preserving integrity.

Precise proofs work best with data organized in the form of key-value pairs.

Functionality

Precise proofs are based on Merkle trees. A Merkle tree (illustrated in Figure 1) is a data structure in which data items are placed at the leaves of a binary tree. The data items are hashed, the resulting hashes are hashed in pairs, and this is repeated recursively until a single root hash is reached. Merkle trees have several interesting properties but mainly serve to verify the integrity of the individual data items stored at the leaves of the tree. A more detailed description of Merkle trees can be found here.

Figure 1 - Merkle Tree Data Structure – Source: https://en.wikipedia.org/wiki/Merkle_tree#/media/File:Hash_Tree.svg
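The construction shown in Figure 1 can be sketched in a few lines. This is an illustrative sketch, not the code of any particular library: the `merkleRoot` helper, the choice of SHA-256 and the rule that an unpaired node is promoted unchanged to the next level are assumptions.

```javascript
const crypto = require("crypto");

// Hex-encoded SHA-256 digest of a UTF-8 string.
function sha256(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// Computes the Merkle root of a non-empty list of data items.
// Assumption: an unpaired node at the end of a level is promoted unchanged.
function merkleRoot(items) {
  if (items.length === 0) throw new Error("at least one item is required");
  let level = items.map((item) => sha256(item)); // leaf hashes
  while (level.length > 1) {
    const next = [];
    for (let i = 0; i < level.length; i += 2) {
      const left = level[i];
      const right = level[i + 1];
      next.push(right === undefined ? left : sha256(left + right));
    }
    level = next;
  }
  return level[0];
}

const root = merkleRoot(["L1", "L2", "L3", "L4"]);
const tampered = merkleRoot(["L1", "L2", "L3", "L4 (changed)"]);
console.log(root !== tampered); // true: changing any leaf changes the root
```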


Precise proofs work by agreeing on a data schema, preferably consisting of key-value pairs. Each data item is stored at a leaf of a Merkle tree, the root of which is made public, for example by storing it on a blockchain. Knowledge of the data can then be proven by revealing as little as a single data item and submitting it together with the sibling hashes that are required to recalculate the root of the Merkle tree.
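The following sketch shows how a single item can be revealed together with its sibling hashes and checked against the public root. The function names (`buildLevels`, `createProof`, `verifyProof`), the "key:value" item strings and the proof layout are hypothetical and chosen for illustration only.

```javascript
const crypto = require("crypto");

// Hex-encoded SHA-256 digest of a UTF-8 string.
function sha256(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// Builds all levels of the tree, from the leaf hashes (index 0) up to the root.
function buildLevels(leafHashes) {
  const levels = [leafHashes];
  while (levels[levels.length - 1].length > 1) {
    const prev = levels[levels.length - 1];
    const next = [];
    for (let i = 0; i < prev.length; i += 2) {
      // An unpaired node is promoted unchanged (assumption of this sketch).
      next.push(i + 1 < prev.length ? sha256(prev[i] + prev[i + 1]) : prev[i]);
    }
    levels.push(next);
  }
  return levels;
}

// Collects the sibling hashes on the path from leaf `index` to the root.
function createProof(levels, index) {
  const path = [];
  for (let depth = 0; depth < levels.length - 1; depth++) {
    const siblingIndex = index % 2 === 0 ? index + 1 : index - 1;
    if (siblingIndex < levels[depth].length) {
      path.push({ hash: levels[depth][siblingIndex], siblingIsLeft: index % 2 === 1 });
    }
    index = Math.floor(index / 2);
  }
  return path;
}

// Recomputes the root from one revealed item and its sibling hashes.
function verifyProof(item, path, expectedRoot) {
  let hash = sha256(item);
  for (const step of path) {
    hash = step.siblingIsLeft ? sha256(step.hash + hash) : sha256(hash + step.hash);
  }
  return hash === expectedRoot;
}

const items = ["id:42", "owner:alice", "energy:100", "unit:kWh"];
const levels = buildLevels(items.map((item) => sha256(item)));
const root = levels[levels.length - 1][0];
const proof = createProof(levels, 2); // reveal only "energy:100"

console.log(verifyProof("energy:100", proof, root)); // true
console.log(verifyProof("energy:999", proof, root)); // false
```

The verifier sees one item and two hashes here; the other three items stay hidden behind their hashes.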

EnergyWeb Origin uses a custom implementation of precise proofs for certificates of origin. A so-called commitment, consisting of the Merkle root, the schema and the off-chain URL of a certificate, is stored on the blockchain. It is now possible to reveal some fields of a certificate to a third party, such as an electricity market regulator, in order to prove that it belongs to a specific on-chain commitment.

EnergyWeb Implementation

EnergyWeb provides a JavaScript implementation of precise proofs. It differs in some respects from an alternative open-source implementation by Centrifuge, which is written in Go.

While Centrifuge's implementation has optimized support for hierarchical and nested data in the form of dot notation, which can be flattened and keyed, EnergyWeb’s implementation has improved security. In contrast to the Centrifuge solution, the data schema is included in the Merkle tree. This prevents key injection and duplicate key attacks, which would allow proving fake data.
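The duplicate key attack and the role of the schema can be illustrated with a two-leaf toy tree. This is a sketch under stated assumptions, not the EnergyWeb or Centrifuge code: the "key:value" leaf format, the helper names (`verifyPlain`, `verifyWithSchema`) and the rule that the leaf at position i must carry the key listed at schema position i are all hypothetical, and the schema is assumed to be bound to the published commitment.

```javascript
const crypto = require("crypto");

// Hex-encoded SHA-256 digest of a UTF-8 string.
function sha256(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// A dishonest prover commits to the same key twice with different values.
const leaves = ["amount:100", "amount:999"];
const root = sha256(sha256(leaves[0]) + sha256(leaves[1]));

// Without a schema, the verifier only recomputes the root of the two-leaf tree.
function verifyPlain(leaf, index, siblingHash, expectedRoot) {
  const own = sha256(leaf);
  const combined = index === 0 ? sha256(own + siblingHash) : sha256(siblingHash + own);
  return combined === expectedRoot;
}

// With a schema, the revealed key must also match the key the schema
// lists at that leaf position, and the schema must not repeat keys.
function verifyWithSchema(leaf, index, siblingHash, schema, expectedRoot) {
  const key = leaf.split(":")[0];
  if (new Set(schema).size !== schema.length) return false; // duplicate key in schema
  if (schema[index] !== key) return false; // injected or duplicated key in the tree
  return verifyPlain(leaf, index, siblingHash, expectedRoot);
}

// Both contradictory values verify against the same root.
console.log(verifyPlain("amount:100", 0, sha256("amount:999"), root)); // true
console.log(verifyPlain("amount:999", 1, sha256("amount:100"), root)); // true

// With the agreed schema, the second leaf is rejected.
const schema = ["amount", "owner"];
console.log(verifyWithSchema("amount:100", 0, sha256("amount:999"), schema, root)); // true
console.log(verifyWithSchema("amount:999", 1, sha256("amount:100"), schema, root)); // false
```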

On the other hand, the EnergyWeb implementation uses a plain JSON serialization format and does not provide a language-independent, packed proof format, such as Centrifuge’s Protocol Buffers-based encoding.

Limitations

Precise proofs are not a universal solution to privacy-preserving data integrity verification. While the technique is very useful for the use cases outlined above and is lighter and cheaper than more complex cryptographic solutions, there are also some limitations:

  • A precise data schema needs to be agreed on by all participants in advance. The same is true for the hashing method.
  • The range of data formats that can be used is limited. Key-value pairs work best.
  • Some trust assumptions have to be made.
  • The solution is interactive, in contrast to zero-knowledge proofs, which in some cases can be non-interactive.
  • Off-chain data storage is required, and a verifier tool needs to be employed.
  • The use cases are limited to those mentioned above and not as generic as zero-knowledge proofs.

So how to get started?

The proof-of-concept version of the Origin project's Precise Proofs implementation was published on our GitHub space: https://github.com/energywebfoundation/precise-proofs.

It is an npm package with two objectives:

  • To provide an easy-to-use tool for creating commitments and proofs and for verifying them. Just "npm install" it for your project.
  • To provide demos and examples that show how it is used and demonstrate its capabilities and limitations.

Follow the README to get started with it.

Components and main steps

The document

The document is in JSON format. You have to provide it as input.
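An input document of this kind might look as follows. This is a hypothetical example: the variable name `exampleDocument` and all field names and values are invented for illustration and are not taken from the project.

```javascript
// Hypothetical certificate-like document made of key-value pairs.
const exampleDocument = {
  certificateId: "cert-0001",
  deviceType: "solar",
  energyWh: 100000,
  country: "DE",
  generationStart: "2019-01-01T00:00:00Z",
};

// The document is handed to the tool as JSON text.
const json = JSON.stringify(exampleDocument, null, 2);
console.log(json);
```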












Leaves of the Merkle tree

As a second step, leaves are constructed by the tool based on the document that you provided as input.
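How leaves could be derived from a document is sketched below. This is not the tool's actual algorithm: the `createLeaves` helper, the leaf layout, the sorted key order and the random salt (added so that low-entropy values cannot be guessed from a leaf hash) are assumptions of this sketch.

```javascript
const crypto = require("crypto");

// Hex-encoded SHA-256 digest of a UTF-8 string.
function sha256(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// Turns each top-level key-value pair of the document into one leaf.
function createLeaves(doc) {
  return Object.keys(doc)
    .sort() // fixed key order so that every party builds the same tree
    .map((key) => {
      const value = JSON.stringify(doc[key]);
      const salt = crypto.randomBytes(16).toString("hex");
      // Hashing a JSON array keeps the three parts unambiguously separated.
      return { key, value, salt, hash: sha256(JSON.stringify([key, value, salt])) };
    });
}

const leaves = createLeaves({ energyWh: 100, country: "DE" });
console.log(leaves.map((leaf) => leaf.key)); // keys in sorted order: country, energyWh
```

Only the `hash` of each leaf goes into the tree; the key, value and salt stay with the prover until that leaf is revealed.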

Merkle tree, schema and the commitment

Then, the tree is constructed from the leaves.

The extended tree root and the schema together form the commitment that will be published. You can choose whether your commitment uses the extended tree root (that is, the root combined with the schema) or the plain tree root; using the extended root is recommended for security.
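One way to bind a schema to a tree root is sketched below. The exact construction used by the tool is not described here, so `createExtendedRoot`, `createCommitment` and the hashing of the JSON-serialized key list are assumptions made for illustration.

```javascript
const crypto = require("crypto");

// Hex-encoded SHA-256 digest of a UTF-8 string.
function sha256(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// Hypothetical extended root: the plain root hashed together with the schema hash.
function createExtendedRoot(merkleRoot, schema) {
  return sha256(merkleRoot + sha256(JSON.stringify(schema)));
}

// The commitment to publish: the extended root plus the schema itself.
function createCommitment(merkleRoot, schema) {
  return { rootHash: createExtendedRoot(merkleRoot, schema), schema };
}

const merkleRoot = sha256("placeholder for a real Merkle root");
const commitment = createCommitment(merkleRoot, ["country", "energyWh"]);
const altered = createCommitment(merkleRoot, ["country", "energyWh", "injected"]);

console.log(commitment.rootHash !== altered.rootHash); // true: a changed schema changes the commitment
```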

The (extended) proof

Once you have your commitment, you can start generating proofs. Currently, a proof can only reveal one key at a time.

The prover then sends the proof to the verifier (e.g. auditor).

Verification

The verifier can use this very same tool to check whether the proof adheres to the commitment.
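The exchange between prover and verifier can be sketched end to end with a two-leaf tree. This does not use the tool's API: the proof layout, the `leafHash`, `extendedRoot` and `verify` helpers and the fixed salts are hypothetical and serve only to show what the verifier recomputes.

```javascript
const crypto = require("crypto");

// Hex-encoded SHA-256 digest of a UTF-8 string.
function sha256(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// Hypothetical leaf and extended-root constructions; the real tool may differ.
const leafHash = (key, value, salt) => sha256(JSON.stringify([key, value, salt]));
const extendedRoot = (root, schema) => sha256(root + sha256(JSON.stringify(schema)));

// Verifier side: parse the proof received as JSON and check it against the commitment.
function verify(proofJson, commitment) {
  const proof = JSON.parse(proofJson);
  if (!commitment.schema.includes(proof.key)) return false; // key not in the agreed schema
  let hash = leafHash(proof.key, proof.value, proof.salt);
  for (const step of proof.path) {
    hash = step.siblingIsLeft ? sha256(step.hash + hash) : sha256(hash + step.hash);
  }
  return extendedRoot(hash, commitment.schema) === commitment.rootHash;
}

// Prover side: two leaves for brevity, with fixed salts so the example is repeatable.
const schema = ["country", "energyWh"];
const leafCountry = leafHash("country", "DE", "salt-a");
const leafEnergy = leafHash("energyWh", 100, "salt-b");
const commitment = { rootHash: extendedRoot(sha256(leafCountry + leafEnergy), schema), schema };

// Reveal only "energyWh"; the "country" leaf is sent as an opaque hash.
const proofJson = JSON.stringify({
  key: "energyWh",
  value: 100,
  salt: "salt-b",
  path: [{ hash: leafCountry, siblingIsLeft: true }],
});

console.log(verify(proofJson, commitment)); // true
```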

The published commitment

There are no restrictions on how the commitment is made public. Precise Proofs is chain-agnostic and can be used both on-chain and off-chain. In our case, we write commitments to a contract that anyone can read.

A simple example "registry of commitments" contract can be found in the repo as well and can be used by anyone for experimentation. It is deployed at 0x535ea027738590b1ad2521659f67fb25b08dd5ee on Tobalaba. Please do not use it in production, as it is for demonstration purposes only, or at least use it with common sense: it has not been thoroughly tested or audited, so we take no responsibility for it.

PreciseProofCommitmentRegistry: 0x535ea027738590b1ad2521659f67fb25b08dd5ee

| Function | Returns | Description |
| --- | --- | --- |
| commitment(string _name, string _hash, string _schema) | true/false | Publishes a new commitment. A commitment is identified by an arbitrary name and the address of the "prover" who publishes the commitment. Returns true if it succeeded. |
| getCommitment(address _by, string _name) | merkle root, schema string | Returns the commitment identified by the address of the prover and the name. |
| checkCommitment(address _by, string _name, string _hash) | true/false | Checks a root hash against the commitment. If there was a schema, it is assumed to be the extended root hash. |
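The semantics of these three functions can be modeled in memory as follows. This is a toy model for reasoning about the interface, not the contract's code: the `CommitmentRegistry` class, the explicit `sender` parameter (standing in for the transaction sender), the made-up address and the overwrite-on-republish behavior are assumptions.

```javascript
// In-memory model of the registry functions listed above.
class CommitmentRegistry {
  constructor() {
    this.entries = new Map(); // JSON [address, name] -> { hash, schema }
  }

  // Builds a lookup key from the prover address and the commitment name.
  keyFor(address, name) {
    return JSON.stringify([address.toLowerCase(), name]);
  }

  // Publishes a commitment under the sender's address and an arbitrary name.
  commitment(sender, name, hash, schema) {
    this.entries.set(this.keyFor(sender, name), { hash, schema });
    return true;
  }

  // Returns the stored root hash and schema, or undefined if nothing was published.
  getCommitment(by, name) {
    return this.entries.get(this.keyFor(by, name));
  }

  // Checks a root hash against the stored commitment.
  checkCommitment(by, name, hash) {
    const entry = this.getCommitment(by, name);
    return entry !== undefined && entry.hash === hash;
  }
}

const registry = new CommitmentRegistry();
const prover = "0x00000000000000000000000000000000000000aa"; // made-up address
registry.commitment(prover, "certificate-1", "0xabc123", "country,energyWh");

console.log(registry.checkCommitment(prover, "certificate-1", "0xabc123")); // true
console.log(registry.checkCommitment(prover, "certificate-1", "0xdef456")); // false
```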

Demos

You can find the demo descriptions in the README. They cover many cases, including demonstrations of key attacks. Feel free to experiment and to open an issue or pull request with anything interesting. It is an open-source project.

Acknowledgments

Special thanks to the Origin team, especially to Heiko, who built the Precise Proofs PoC.