The body reads first and answers second — Inter at weight 400, sentence case, with its qualifiers kept. No hype, no exclamation.
- def respond(q): - return model.answer(q) + async def respond(q): + async for tok in model.stream(q, n=3): + yield tok
- def respond(q): - return model.answer(q) + async def respond(q): + async for tok in model.stream(q): + yield tok