Training Data

All discussions tagged with this topic

Found 5 discussions

June 23, 2025

Discover why AI models frequently use em dashes in their writing, a habit attributed to training data and auto-correction, and learn practical keyboard shortcuts for typing them yourself.

June 23, 2025

Discover practical tips and creative analogies parents use to explain AI concepts, limitations, and ethics to their children, fostering critical thinking in the age of generative AI.

June 23, 2025

Users are observing AI models like ChatGPT and Gemini displaying 'thoughts' in non-English languages. This discussion explores why this happens, linking it to multilingual training, internal token efficiency, and research findings that suppressing it can even reduce performance.

June 18, 2025

A Hacker News discussion explores whether a programming language designed specifically for AI generation could improve code reliability by emphasizing explicitness, and how this interacts with LLM limitations, training data needs, and human usability.

June 14, 2025

Hacker News discusses the dream of building a searchable equivalent of Google Books and Sci-Hub from Anna's Archive, exploring technical challenges, legal nightmares, and the potential impact on research and AI.