Author: smith.dev (did:plc:kdjm3mwvbvbvaqtb6r3it5qq)

Record

uri:
"at://did:plc:kdjm3mwvbvbvaqtb6r3it5qq/app.bsky.feed.post/3jufstzgxu42u"
cid:
"bafyreiat5njbiuiuk43jiw4ohbxnnr5gef3uqf5m6yecyrhvkxdoj2ghlu"
value:
text:
"one thing i haven’t spent time to understand yet is it seems like LLM tokenizers don’t bother with the kinds of preprocessing or normalizing you’d do for traditional search queries like stemming, case, etc. - i guess this is because in the vector space small differences don’t amount to much?"
$type:
"app.bsky.feed.post"
reply:
root:
cid:
"bafyreigfpzsm7oxgeev7fzd3i7k6nv4vfo6c47w4lqyrflfv64twmsusuy"
parent:
cid:
"bafyreigfpzsm7oxgeev7fzd3i7k6nv4vfo6c47w4lqyrflfv64twmsusuy"
createdAt:
"2023-04-28T04:49:49.181Z"