ByteString should not be considered Text #107
Comments
Yes. I'm not a fan of that decision made a decade ago. But changing that would be a big breaking change, which wouldn't cause type-errors, so I'm hesitant to "just" do it. If anything, I'd first remove |
It's pretty clearly incorrect right now. I'm sure something might break, but having it assume a I think it'd be fine to remove the instances and/or document how to handle |
Using |
Why would that break anything? My understanding is that it can only store a subset of valid |
Inserting as binary

    *Database.PostgreSQL.Simple Data.ByteString> execute conn "INSERT INTO foo(bar) VALUES (?);" (Only (Binary ("foo" :: ByteString)))
    1

querying as text

    *Database.PostgreSQL.Simple Data.ByteString> query_ conn "SELECT * from foo;" :: IO [Only ByteString]
    [Only {fromOnly = "\\x666f6f"}]

doesn't roundtrip. Also the other way around doesn't work. Inserting as text

    *Database.PostgreSQL.Simple Data.ByteString> execute conn "INSERT INTO foo(bar) VALUES (?);" (Only ("foo" :: ByteString))
    1

but querying as binary:

    *Database.PostgreSQL.Simple Data.ByteString> query_ conn "SELECT * from foo;" :: IO [Only (Binary ByteString)]
    *** Exception: Incompatible {errSQLType = "text", errSQLTableOid = Just (Oid 27717), errSQLField = "bar", errHaskellType = "Binary ByteString", errMessage = "types incompatible"}

with
as the schema. |
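The value returned in the first query above is consistent with PostgreSQL's hex output format for bytea, which is `\x` followed by two hex digits per byte. As a minimal sketch of that encoding (the name `byteaHex` is hypothetical and not part of postgresql-simple), using only the bytestring package:

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as BL

-- Hypothetical helper: render bytes the way PostgreSQL prints bytea in
-- hex format, i.e. a literal backslash and 'x', then lowercase hex digits.
byteaHex :: ByteString -> ByteString
byteaHex bytes =
  BL.toStrict (B.toLazyByteString (B.string7 "\\x" <> B.byteStringHex bytes))
```

Applied to the bytes of "foo" (0x66, 0x6f, 0x6f), this yields the same `\x666f6f` text that shows up when the bytea column is read back as a textual ByteString.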
OK, but didn't you throw that exception on the last one? In any case, the example is obviously flawed, though someone probably has written that code somewhere. I'd be fine with making it not compile as it would've at least not told me that I had a bad UTF-8 character in my |
That example illustrates that if there is existing software which uses textual bytestrings, it successfully inserts and queries the data. It's true that those oid-mismatch exceptions are thrown by As I said, that design wart was made over a decade ago, and it's not easy to unwrap. The type-checker doesn't help with migration, except if the instance is removed, but that will cause inconveniences too. (The |
Now that deprecated instances are accepted and implemented, can we revisit this with the plan of adding a deprecation and eventually removing and then adding back the proper instance? |
We're fighting the same issue here.
It is only needed for ToField, as FromField works correctly. It seems that's the least disruptive option, as it can be simply fixed in downstream code by either defining an orphan instance (for expediency) or by moving to newtypes. The existence of the ByteString instance creates a risk of code breaking intermittently, and that is very hard to debug. Aeson also does not define instances for ByteString, and that is a saner decision than any alternative. If there were agreement to eventually remove it, we'd just fork it now and use the fork until it's merged. |
I just ran into this. Inserting bytestring fails while attempting to decode it as UTF-8 (!). I agree it doesn't make sense to assume a ByteString value to have UTF-8 encoding. What's the current workaround? If the correct behavior won't be implemented, at least it'd be nice to have a |
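To illustrate why assuming UTF-8 breaks on binary data, here is a small sketch using `decodeUtf8'` from the text package. The names `jpegHeader` and `decodesAsUtf8` are made up for this example, and the check is only an approximation of whatever validation happens on the insert path:

```haskell
import qualified Data.ByteString as BS
import Data.Either (isRight)
import qualified Data.Text.Encoding as TE

-- The first three bytes of a typical JPEG file. The byte 0xFF can never
-- occur in well-formed UTF-8.
jpegHeader :: BS.ByteString
jpegHeader = BS.pack [0xFF, 0xD8, 0xFF]

-- Hypothetical helper: True when the bytes decode as UTF-8 text.
decodesAsUtf8 :: BS.ByteString -> Bool
decodesAsUtf8 = isRight . TE.decodeUtf8'
```

Plain ASCII such as the bytes of "foo" passes this check, which is why trivial tests tend not to expose the problem, while `jpegHeader` fails it.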
This works:

    {-# LANGUAGE ImportQualifiedPost, OverloadedStrings #-}
    import Data.ByteString (ByteString)
    import Data.ByteString qualified as ByteString
    import Data.Binary.Builder qualified as Builder
    import Database.PostgreSQL.Simple.ToField qualified as SQL
    import Numeric (showHex)

    newtype SQLByteString = SQLByteString ByteString

    -- Renders the bytes as a quoted hex literal: '\x' followed by two hex digits per byte.
    instance SQL.ToField SQLByteString where
      toField (SQLByteString bytes) = SQL.Plain $ SQL.inQuotes $
        "\\x" <> ByteString.foldr
          (\byte r -> (if byte <= 0xF then Builder.putCharUtf8 '0' else mempty)
            <> Builder.putStringUtf8 (showHex byte [])
            <> r
          ) mempty bytes

I can add this in a PR if there's interest. The newtype can be named whatever you like. I'm bad at names. I guess it could go in |
There's already a type Treating |
I see! I had totally missed this. Thanks! |
The ToField instance of ByteString uses Escape, which is meant for text-like strings; that is a clearly invalid assumption for ByteString. The Binary wrapper can be used to treat ByteString as non-text data, but that's redundant. The default causes character encoding issues when storing bytea values, and those issues may not appear in trivial tests. I found this by means of a property test that reported bad UTF-8 characters in JPEG thumbnails I was storing.