
For an LLM to correctly respond to an instruction, it must understand both the semantics and the domain (i.e., subject area) of a given task-instruction pair. However, syntax can also convey implicit information. Recent work shows that syntactic templates—frequent sequences of Part-of-Speech (PoS) tags—are prevalent in training data and often appear in model outputs.
Using a synthetic training dataset, we find that the syntactic-domain correlation can lower performance (mean 0.51 ± 0.06) on entity knowledge tasks in OLMo-2 models (1B-13B). We introduce an evaluation framework to detect this phenomenon in trained models, and show that it occurs on a subset of the FlanV2 dataset in open (OLMo-2-7B; Llama-4-Maverick) and closed (GPT-4o) models.
Finally, we present a case study on the implications for safety finetuning, showing that unintended syntactic-domain correlations can be used to bypass refusals in OLMo-2-7B Instruct and GPT-4o.
Our findings highlight two needs: (1) to explicitly test for syntactic-domain correlations, and (2) to ensure syntactic diversity in training data, specifically within domains, to prevent such spurious correlations.
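In case "syntactic template" sounds abstract: it just means the sequence of Part-of-Speech tags a sentence reduces to, and a template is "frequent" when many training sentences reduce to the same sequence. Here's a minimal sketch of the idea (my own illustration, not the paper's code; it assumes spaCy and its en_core_web_sm model are installed):

```python
# Minimal sketch, not from the paper: reduce sentences to PoS-tag "templates"
# and count how often each template recurs.
# Assumes: pip install spacy && python -m spacy download en_core_web_sm
from collections import Counter

import spacy

nlp = spacy.load("en_core_web_sm")

def pos_template(sentence: str) -> tuple[str, ...]:
    """Map a sentence to its sequence of coarse Part-of-Speech tags."""
    return tuple(tok.pos_ for tok in nlp(sentence) if not tok.is_punct)

toy_corpus = [
    "Where is Paris located?",
    "Where is Tokyo situated?",
    "Where is Cairo found?",
    "The film was released in 1999.",
]

# A "syntactic template" in the paper's sense is roughly a frequent tag sequence.
template_counts = Counter(pos_template(s) for s in toy_corpus)
for template, count in template_counts.most_common():
    print(count, template)
```

On this toy corpus, the three "Where is X ...?" questions would be expected to collapse to a single tag sequence, which is the kind of repeated structure the paper is talking about.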
Hummm.... what does this mean, in terms a layperson can understand?
It sounds to me like model performance goes down when, in the training data, there's a lot of correlation between speech patterns and domains. As an extreme example, imagine that all the medical data used professional language and all the movie data used casual language. Then the model can't distinguish between domain and speech pattern, which can lower performance (for example, if you were to ask about medical knowledge using casual language).
Not entirely sure if my description is accurate, but that's what I'm getting out of it...
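As a rough picture of what "detecting" such a correlation could look like, here's a toy sketch on made-up data (hypothetical, not the paper's evaluation framework; same spaCy assumption as the sketch above):

```python
# Toy sketch on hypothetical data: is a PoS template confounded with a domain?
# Assumes spaCy + en_core_web_sm are installed.
from collections import Counter, defaultdict

import spacy

nlp = spacy.load("en_core_web_sm")

def pos_template(sentence: str) -> tuple[str, ...]:
    return tuple(tok.pos_ for tok in nlp(sentence) if not tok.is_punct)

labeled_corpus = [
    ("Where is Paris located?", "geography"),
    ("Where is Tokyo situated?", "geography"),
    ("What causes a migraine?", "medicine"),
    ("What triggers an allergy?", "medicine"),
]

domains_per_template = defaultdict(Counter)
for sentence, domain in labeled_corpus:
    domains_per_template[pos_template(sentence)][domain] += 1

for template, domain_counts in domains_per_template.items():
    if len(domain_counts) == 1:
        # Every sentence with this template comes from a single domain, so a
        # model could learn the template itself as a proxy for the domain.
        print("confounded:", template, dict(domain_counts))
```

If a template only ever co-occurs with one domain, the model has no way to tell which of the two signals it should actually rely on.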
reply
Most concrete example from there:
For instance, an LLM might learn that a question like “Where is Paris located?” is structured as adverb/verb/proper noun/verb. If there are many examples of this sentence construction in the model’s training data, the LLM may associate that syntactic template with questions about countries. So, if the model is given a new question with the same grammatical structure but nonsense words, like “Quickly sit Paris clouded?” it might answer “France” even though that answer makes no sense.
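For the curious, here's what that looks like mechanically. A quick sketch (again assuming spaCy and en_core_web_sm are installed; a real tagger may label the nonsense sentence slightly differently, so treat the output as illustrative):

```python
# Sketch: the real question and the nonsense one can reduce to the same coarse
# PoS template, which is the cue the model is suspected of keying on.
import spacy

nlp = spacy.load("en_core_web_sm")

for text in ("Where is Paris located?", "Quickly sit Paris clouded?"):
    tags = [tok.pos_ for tok in nlp(text) if not tok.is_punct]
    print(f"{text!r:32} -> {tags}")

# Both lines should come out as roughly adverb / (auxiliary) verb / proper
# noun / verb: the same shape the article describes, even though the second
# sentence is meaningless.
```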

It sounds to me like model performance goes down when, in the training data, there's a lot of correlation between speech patterns and domains.
I think the issue is more about recurring speech patterns inside domains.
reply
100 sats \ 1 reply \ @0xbitcoiner 6h
Seems like CGPT ain’t got that disease! Ahaha
Quickly sit Paris clouded?
ChatGPT Free:
I’m not sure what you mean by “Quickly sit Paris clouded?” It looks like a sentence fragment or a poetic phrase.
Would you like me to:
Interpret it?
Rewrite it into a clearer sentence?
Guess what you intended to ask (maybe about weather in Paris or a description)?
Let me know!
reply
They claim they've reproduced it (though maybe not this particular example) on 4o.
reply
Hmm, interesting.
LLMs are weird.
reply
0 sats \ 0 replies \ @adlai 2h
Thank you for linking this; to my taste, too much of the news about AI/LLMs is some combination of opinions, predictions, and shallow promotion of experiments at gluing the existing tech onto a new domain, and not enough research into understanding the training process and its resulting models.
Admittedly, my disappointment might be more due to where I source news.
reply