Show HN: I Made a Hot or Not Benchmark for AI Design
Summary
A team built a "Hot or Not"-style benchmark game to evaluate and rank AI-generated frontend designs. The results show wide variation in quality across models and categories: some models, such as DeepSeek and Grok, excel in certain areas, while others, such as OpenAI's models, perform inconsistently, especially outside game development. The crowdsourced rankings expose the strengths and weaknesses of current AI design capabilities, showing both real progress and persistent limitations.
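
The summary does not say how votes are turned into rankings. As a rough sketch only, assuming each vote is a head-to-head choice between two designs and that models are ranked by simple win rate (both are assumptions, and the actual benchmark may use a different scheme), the aggregation could look like this. The function name and the model names are hypothetical placeholders, and the votes are made-up illustration data, not results from the benchmark.

```python
from collections import defaultdict


def rank_by_win_rate(votes):
    """Rank models by the fraction of head-to-head votes they won.

    `votes` is an iterable of (winner, loser) pairs. This is an assumed
    vote format, not the benchmark's documented one.
    """
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for winner, loser in votes:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    rates = {model: wins[model] / appearances[model] for model in appearances}
    # Highest win rate first.
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Hypothetical votes for illustration only.
votes = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
    ("model_c", "model_a"),
]
ranking = rank_by_win_rate(votes)
for model, rate in ranking:
    print(f"{model}: {rate:.2f}")
```

Win rate is the simplest option but ignores opponent strength. Computing it separately per category would be one way to surface the kind of per-category differences the summary describes.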