Mr. Chatterbox is a Victorian-era ethically trained model

(simonwillison.net)

17 points | by y1n0 3 hours ago

2 comments

  • lovelearning 0 minutes ago
    I thought the title meant the training data used was ethics content and ethical reasoning. Turns out "ethically trained" means the training data used doesn't violate copyright laws.
  • parpfish 28 minutes ago
    after testing, i'm pretty sure that either a) i dont understand Victorian speech very well or b) a model with 340million parameters doesn't generate particularly coherent speech
    • starkparker 22 minutes ago
      b: "The 2022 Chinchilla paper suggests a ratio of 20x the parameter count to training tokens. For a 340m model that would suggest around 7 billion tokens, more than twice the British Library corpus used here. The smallest Qwen 3.5 model is 600m parameters and that model family starts to get interesting at 2b—so my hunch is we would need 4x or more the training data to get something that starts to feel like a useful conversational partner."