Berkeley Defi Research Initiative

0x Intelligent Forensic Competition

The emergence of blockchain technologies and real-world DeFi applications has contributed to rampant fraud and scam activity online. Because many public, permissionless blockchains are decentralized, regular users lack the means to effectively verify the trustworthiness of counterparties when conducting blockchain transactions. Indeed, users of many social media platforms are frequently bombarded with scam messages and even well-coordinated fraud or rug-pull projects. Making matters worse, when a fraud occurs, permissionless blockchains cannot proactively stop the perpetrators or recover the lost funds for the victims.

Notwithstanding these critical security weaknesses, which can leave regular consumers vulnerable, public, permissionless blockchains are also fully transparent: they openly and irreversibly document all transactions made by all parties. Therefore, we believe more intelligent security monitoring and alert solutions can be developed to proactively warn regular consumers about potential security risks. This is the core purpose of the 0x Intelligent Forensic Competition.

Rules of the Competition:

The Organizer will provide a database of the complete transaction history of the Ethereum (ETH) blockchain, from its genesis block to the last block in 2020, referred to as the Training Database. In addition, the Organizer will aggregate a list of known trusted ETH wallet addresses and a list of known scam ETH wallet addresses for the Training Database, based on public records on the Internet; together these lists are referred to as the Training Labels. Please note that neither the list of trusted addresses nor the list of scam addresses is complete. The Organizer will make a reasonable effort to ensure high True Positive and high True Negative rates in these labels. One challenge of the competition is that the vast majority of ETH addresses have no ground-truth label confirming them as either trusted or scam.
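As an illustration of the incomplete-label setting described above, the following minimal Python sketch merges a trusted list and a scam list into a single mapping and marks every other address as unlabeled. The function name `build_labels`, the 0.0/1.0/None encoding, and the in-memory lists are assumptions made for this example; the actual format of the Training Labels has not been announced.

```python
from typing import Dict, Iterable, Optional

TRUSTED = 0.0  # assumed encoding for a known trusted address
SCAM = 1.0     # assumed encoding for a known scam address


def build_labels(
    all_addresses: Iterable[str],
    trusted: Iterable[str],
    scam: Iterable[str],
) -> Dict[str, Optional[float]]:
    """Map each address to 0.0 (trusted), 1.0 (scam), or None (unlabeled)."""
    # Hex addresses are compared case-insensitively.
    trusted_set = {a.lower() for a in trusted}
    scam_set = {a.lower() for a in scam}

    # An address on both lists signals a labeling error; refuse to guess.
    conflicts = trusted_set & scam_set
    if conflicts:
        raise ValueError(f"addresses on both lists: {sorted(conflicts)}")

    labels: Dict[str, Optional[float]] = {}
    for addr in all_addresses:
        key = addr.lower()
        if key in scam_set:
            labels[key] = SCAM
        elif key in trusted_set:
            labels[key] = TRUSTED
        else:
            labels[key] = None  # most addresses are expected to land here
    return labels


if __name__ == "__main__":
    # Hypothetical toy addresses, not real wallets.
    demo = build_labels(["0xAa", "0xbb", "0xcc"], trusted=["0xaa"], scam=["0xBB"])
    print(demo)
```

Keeping unlabeled addresses as None, rather than treating them as trusted, leaves room for semi-supervised or positive-unlabeled approaches.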

Participants in the 0x competition may train classifiers that assign a query ETH address an estimated risk score between 0 and 1, with 0 meaning no risk and 1 meaning high risk. Any classification method may be used, as long as the submitted code is written in Python. The ranking of all participating solutions will be determined by two batches of test data:

ETH Test: A random period of ETH transaction history after 2020 will be used to form the first part of the Test Benchmark.
Wild Card Test: Another, undisclosed blockchain will be used as the second part of the Test Benchmark to validate the accuracy of submitted solutions.

For the Test Benchmark, all real blockchain addresses will be anonymized, but a one-to-one correspondence between real and anonymized addresses will be preserved. We strongly discourage participants from trying to “game” the system by overfitting their algorithms to ETH on-chain data after 2020; this is also the chief reason the Organizer is proposing a Wild Card Test based on another, undisclosed blockchain and its transaction history. Furthermore, to do well on the more challenging Wild Card Test, submissions are advised to limit their reliance on transaction features specific to the ETH blockchain, such as gas fee values.
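To make the advice about chain-specific features concrete, the sketch below computes a few per-address features that rely only on sender, receiver, value, and timestamp, quantities most transaction ledgers record, and then squashes a weighted sum into the required 0-to-1 range. The `Tx` record, the feature names, and the logistic scoring with hand-supplied weights are all assumptions for illustration; this is not the Organizer's schema or a recommended classifier, and real weights would come from training.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Tx(NamedTuple):
    """Hypothetical chain-agnostic transaction record (no gas fields)."""
    sender: str
    receiver: str
    value: float    # amount in the chain's native unit
    timestamp: int  # seconds since epoch


def address_features(txs: Iterable[Tx]) -> Dict[str, Dict[str, float]]:
    """Aggregate simple per-address statistics from a transaction stream."""
    sent = defaultdict(list)
    received = defaultdict(list)
    peers = defaultdict(set)
    times = defaultdict(list)
    for tx in txs:
        sent[tx.sender].append(tx.value)
        received[tx.receiver].append(tx.value)
        peers[tx.sender].add(tx.receiver)
        peers[tx.receiver].add(tx.sender)
        times[tx.sender].append(tx.timestamp)
        times[tx.receiver].append(tx.timestamp)

    features: Dict[str, Dict[str, float]] = {}
    for addr in list(peers):
        features[addr] = {
            "n_out": float(len(sent[addr])),
            "n_in": float(len(received[addr])),
            "n_peers": float(len(peers[addr])),
            "total_out": float(sum(sent[addr])),
            "total_in": float(sum(received[addr])),
            "active_span": float(max(times[addr]) - min(times[addr])),
        }
    return features


def risk_score(feats: Dict[str, float], weights: Dict[str, float],
               bias: float = 0.0) -> float:
    """Map a weighted feature sum to [0, 1] with a numerically stable logistic."""
    z = bias + sum(weights.get(name, 0.0) * v for name, v in feats.items())
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


if __name__ == "__main__":
    txs = [Tx("a", "b", 1.0, 0), Tx("a", "c", 2.0, 1), Tx("b", "a", 0.5, 2)]
    feats = address_features(txs)
    # Placeholder weights, chosen only to exercise the function.
    print(risk_score(feats["a"], {"n_out": 0.1, "n_peers": -0.05}))
```

Because the features depend only on the transaction graph and not on the address strings themselves, they are unaffected by a one-to-one anonymization of addresses.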

The release of the Training Database will include a static database file of about 500 GB, sample Python scripts for efficiently retrieving transaction data, and an installation README file.

Important Dates:

Training Database Release: TBD
Competition Submission Deadline: TBD

337 Cory Hall,
Berkeley, CA 94720

Allen Yang, Executive Director yang@eecs.berkeley.edu
Huo Chao Kuan, Administrative Officer hc_kuan@berkeley.edu