Author: smith.dev (did:plc:kdjm3mwvbvbvaqtb6r3it5qq)

Record

uri:

"at://did:plc:kdjm3mwvbvbvaqtb6r3it5qq/app.bsky.feed.post/3jufstzgxu42u"

cid:

"bafyreiat5njbiuiuk43jiw4ohbxnnr5gef3uqf5m6yecyrhvkxdoj2ghlu"

value:

text:

"one thing i haven’t spent time to understand yet is it seems like LLM tokenizers don’t bother with the kinds of preprocessing or normalizing you’d do for traditional search queries like stemming, case, etc. - i guess this is because in the vector space small differences don’t amount to much?"

$type:

"app.bsky.feed.post"

reply:

root:

cid: "bafyreigfpzsm7oxgeev7fzd3i7k6nv4vfo6c47w4lqyrflfv64twmsusuy"
uri: "at://did:plc:kft6lu4trxowqmter2b6vg6z/app.bsky.feed.post/3jufrpl3eul2n"

parent:

cid: "bafyreigfpzsm7oxgeev7fzd3i7k6nv4vfo6c47w4lqyrflfv64twmsusuy"
uri: "at://did:plc:kft6lu4trxowqmter2b6vg6z/app.bsky.feed.post/3jufrpl3eul2n"

createdAt:

"2023-04-28T04:49:49.181Z"
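
To make the contrast in the post text concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: `normalize_for_search` is a toy stand-in for search-style preprocessing (case folding plus naive suffix stripping, not a real stemmer), and `surface_tokens` is a hypothetical stand-in for the input side of a tokenizer that applies no normalization. It does not use or reproduce any real LLM tokenizer, and it does not settle the question the post raises about vector space.

```python
def normalize_for_search(query: str) -> list[str]:
    """Toy search-style normalization: case-fold, then strip one common suffix."""
    terms = []
    for word in query.casefold().split():
        for suffix in ("ing", "es", "s"):
            # Only strip when a reasonably long stem would remain.
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                word = word[: -len(suffix)]
                break
        terms.append(word)
    return terms


def surface_tokens(text: str) -> list[str]:
    """Hypothetical stand-in for a tokenizer with no case folding or stemming."""
    return text.split()


queries = ["Tokenizers", "tokenizer", "TOKENIZERS"]

# Search-style preprocessing collapses all three spellings to one term.
search_terms = {q: normalize_for_search(q) for q in queries}

# Without normalization, each spelling stays a distinct surface form,
# so any similarity between them has to be learned downstream.
raw_tokens = {q: surface_tokens(q) for q in queries}

for q in queries:
    print(q, search_terms[q], raw_tokens[q])
```

With search-style preprocessing the three spellings become the same index term before any matching happens. With the unnormalized path they remain three different inputs, which is the behavior the post is asking about.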