News BlockFin

OpenAI GPT-4o ranked as best AI model for writing Solidity smart contract code by IQ


SolidityBench by IQ has launched as the first leaderboard to evaluate LLMs in Solidity code generation. Available on Hugging Face, it introduces two new benchmarks, NaïveJudge and HumanEval for Solidity, designed to assess and rank the proficiency of AI models in generating smart contract code.

Developed by IQ’s BrainDAO as part of its forthcoming IQ Code suite, SolidityBench serves to refine IQ’s own EVMind LLMs and compare them against generalist and community-created models. IQ Code aims to offer AI models tailored for generating and auditing smart contract code, addressing the growing need for secure and efficient blockchain applications.

As IQ told CryptoSlate, NaïveJudge takes a novel approach by tasking LLMs with implementing smart contracts based on detailed specifications derived from audited OpenZeppelin contracts. These contracts provide a gold standard for correctness and efficiency. The generated code is evaluated against a reference implementation using criteria such as functional completeness, adherence to Solidity best practices and security standards, and optimization efficiency.

The evaluation process uses advanced LLMs, including different versions of OpenAI’s GPT-4 and Claude 3.5 Sonnet, as impartial code reviewers. They assess the code against rigorous criteria, including implementation of all key functionality, handling of edge cases, error management, proper syntax usage, and overall code structure and maintainability.

Optimization considerations such as gas efficiency and storage management are also evaluated. Scores range from 0 to 100, providing a comprehensive assessment across functionality, security, and efficiency, mirroring the complexities of professional smart contract development.
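To make the judging idea concrete, here is a minimal sketch of how several LLM reviewers' per-criterion scores could be folded into a single 0 to 100 result. The article does not describe how NaïveJudge actually combines reviewer scores, so the criterion names, the weights, the judge labels, and the `aggregate_score` function are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical weights for the three areas named in the article.
# The real NaïveJudge weighting is not given in the source.
CRITERION_WEIGHTS = {
    "functionality": 0.5,
    "security": 0.3,
    "efficiency": 0.2,
}


def aggregate_score(judge_scores: dict[str, dict[str, float]]) -> float:
    """Combine per-judge, per-criterion scores (each 0-100) into one 0-100 score.

    judge_scores maps a judge label to that judge's score for every criterion.
    Scores are first averaged across judges, then combined with fixed weights.
    """
    if not judge_scores:
        raise ValueError("at least one judge is required")

    total = 0.0
    for criterion, weight in CRITERION_WEIGHTS.items():
        per_judge = [scores[criterion] for scores in judge_scores.values()]
        if any(not 0 <= s <= 100 for s in per_judge):
            raise ValueError(f"score for {criterion!r} outside 0-100")
        total += weight * mean(per_judge)
    return total


if __name__ == "__main__":
    # Two hypothetical reviewers scoring one generated contract.
    example = {
        "judge_a": {"functionality": 80, "security": 70, "efficiency": 60},
        "judge_b": {"functionality": 90, "security": 80, "efficiency": 70},
    }
    print(round(aggregate_score(example), 2))
```

Averaging across judges before weighting keeps any single reviewer model from dominating the result; a real pipeline might instead use medians or per-judge calibration.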

Which AI models are best for Solidity smart contract development?

Benchmarking results showed that OpenAI’s GPT-4o model achieved the highest overall score of 80.05, with a NaïveJudge score of 72.18 and HumanEval for Solidity pass rates of 80% at pass@1 and 92% at pass@3.

Interestingly, newer reasoning models like OpenAI’s o1-preview and o1-mini were beaten to the top spot, scoring 77.61 and 75.08, respectively. Models from Anthropic and xAI, including Claude 3.5 Sonnet and grok-2, demonstrated competitive performance with overall scores hovering around 74. Nvidia’s Llama-3.1-Nemotron-70B scored lowest in the top 10 at 52.54.

SolidityBench scores for LLMs (Hugging Face)

Per IQ, HumanEval for Solidity adapts OpenAI’s original HumanEval benchmark from Python to Solidity, encompassing 25 tasks of varying difficulty. Each task includes corresponding tests compatible with Hardhat, a popular Ethereum development environment, facilitating accurate compilation and testing of generated code. The evaluation metrics, pass@1 and pass@3, measure the model’s success on the first attempt and within three attempts, offering insights into both precision and problem-solving capabilities.
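For readers unfamiliar with pass@k, the sketch below shows one common way to compute it: the unbiased estimator introduced with the original HumanEval benchmark, 1 - C(n-c, k) / C(n, k), where n samples were generated for a task and c of them passed its tests. Whether SolidityBench uses exactly this estimator is an assumption; the article only says the metrics capture success on initial attempts and over multiple tries. The function names and the sample data are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task.

    n: number of samples generated for the task
    c: number of those samples that passed the task's tests
    k: attempt budget being evaluated (1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k samples failed, every k-sized draw contains a pass.
    if n - c < k:
        return 1.0
    # Probability that a random k-sized draw contains at least one pass.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks; results holds one (n, c) pair per task."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical outcome for four tasks, three samples each.
    results = [(3, 3), (3, 1), (3, 0), (3, 2)]
    print(benchmark_pass_at_k(results, k=1))
    print(benchmark_pass_at_k(results, k=3))
```

With three samples per task, pass@3 reduces to "did any sample pass", which is why it is always at least as high as pass@1. For the benchmark described here, `results` would hold 25 entries, one per task.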

Goals of using AI models in smart contract development

By introducing these benchmarks, SolidityBench seeks to advance AI-assisted smart contract development. It encourages the creation of more sophisticated and reliable AI models while providing developers and researchers with valuable insights into AI’s current capabilities and limitations in Solidity development.

The benchmarking toolkit aims to advance IQ Code’s EVMind LLMs and also to set new standards for AI-assisted smart contract development across the blockchain ecosystem. The initiative hopes to address a critical need in the industry, where the demand for secure and efficient smart contracts continues to grow.

Developers, researchers, and AI enthusiasts are invited to explore and contribute to SolidityBench, which aims to drive the continuous refinement of AI models, promote best practices, and advance decentralized applications.

Visit the SolidityBench leaderboard on Hugging Face to learn more and begin benchmarking Solidity generation models.

Copyright © 2024 News BlockFin.
News BlockFin is not responsible for the content of external sites.