This repository has been archived by the owner on Aug 3, 2024. It is now read-only.

Avoid errors on non UTF-8 Windows #566

Merged (2 commits) on Jul 4, 2017

Contributor

@igrep igrep commented Jan 3, 2017

Problem

haddock exits with errors like the following:

(1)

haddock: internal error: <stderr>: hPutChar: invalid argument (invalid character)

(2)

haddock: internal error: Language\Haskell\HsColour\Anchors.hs: hGetContents: invalid argument (invalid byte sequence)

(1) is caused by printing the "bullet" character (U+2022) to stderr.
For example, this warning contains it:

Language\Haskell\HsColour\ANSI.hs:62:10: warning: [-Wmissing-methods]
    • No explicit implementation for
        ‘toEnum’
    • In the instance declaration for ‘Enum Highlight’

(2) occurs when the input file passed to readFile contains non-ASCII Unicode characters.
In the case above, '⇒' is the cause.
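
A minimal sketch of cause (1), for illustration only. It assumes the encoding name "ASCII" as a stand-in for a non-Unicode console code page such as CP932 (on Windows the name would be something like "CP932"), and `tryWrite` is a hypothetical helper that is not part of haddock. It writes to a temporary file rather than to stderr so the failure can be observed without breaking the console.

```haskell
import Control.Exception (IOException, try)
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- | Hypothetical helper: write @text@ to a temporary file through a handle
-- that uses the named encoding, and report any encoding failure.
tryWrite :: String -> String -> IO (Either IOException ())
tryWrite encName text = do
  tmpDir <- getTemporaryDirectory
  (path, h) <- openTempFile tmpDir "encoding-demo.txt"
  enc <- mkTextEncoding encName
  hSetEncoding h enc
  -- Flushing forces the characters through the encoder.
  result <- try (hPutStr h text >> hFlush h)
  -- Closing may re-raise the same error; ignore it here.
  _ <- try (hClose h) :: IO (Either IOException ())
  removeFile path
  return result

main :: IO ()
main = do
  -- Plain ASCII text can be encoded.
  tryWrite "ASCII" "No explicit implementation" >>= print
  -- The bullet (U+2022) has no representation in the stand-in encoding.
  tryWrite "ASCII" "\x2022 No explicit implementation" >>= print
```

With the default failure mode, the second call is expected to come back as a `Left` carrying an "invalid character" error, which is the same class of failure haddock reports on stderr.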

Environment

OS: Windows 10
haddock: 2.17.3
GHC: 8.0.1

Solution

Add hSetEncoding handle utf8 to avoid the errors.

Note

  • I tracked down the detailed causes with these debugging changes:
      • haskell@8f29edb
      • haskell@1dd23bf
  • These errors happen even after executing chcp 65001 on the console.
    According to the debug code, hGetEncoding stderr returns CP932 regardless of the console encoding.
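
The kind of check described in the note can be sketched as follows. This is not the debug code from the commits above; `describeHandle` is a hypothetical helper, and the output depends entirely on the machine's locale and console settings.

```haskell
import GHC.IO.Encoding (getLocaleEncoding)
import System.IO

-- | Hypothetical helper: render a handle together with its text encoding.
-- 'hGetEncoding' returns Nothing for handles in binary mode.
describeHandle :: Handle -> IO String
describeHandle h = do
  menc <- hGetEncoding h
  return (show h ++ ": " ++ maybe "binary (no encoding)" show menc)

main :: IO ()
main = do
  -- Encoding currently attached to each standard handle.
  mapM_ (\h -> describeHandle h >>= putStrLn) [stdin, stdout, stderr]
  -- Encoding GHC derived from the locale at program start.
  loc <- getLocaleEncoding
  putStrLn ("locale encoding: " ++ show loc)
```

Running this before and after `chcp 65001` is one way to see whether the console code page change reaches the Haskell runtime at all.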

Member

@alexbiehl alexbiehl left a comment

Thank you for your PR. I am sorry for the late response. I am still catching up on the backlog.

@@ -165,6 +167,7 @@ createIfaces verbosity flags instIfaceMap mods = do
 processModule :: Verbosity -> ModSummary -> [Flag] -> IfaceMap -> InstIfaceMap -> Ghc (Maybe Interface)
 processModule verbosity modsum flags modMap instIfaceMap = do
   out verbosity verbose $ "Checking module " ++ moduleString (ms_mod modsum) ++ "..."
+  liftIO $ hSetEncoding stderr utf8
Member

I am not sure this is a good idea. We would need a graceful degradation mechanism if the terminal isn't utf8 enabled.

Contributor Author

Okay, I'm gonna look into how GHC handles such encoding incompatibility.
Tell me if you have any hints.

Contributor Author

I finally found a workaround:

import GHC.IO.Encoding.CodePage
import GHC.IO.Encoding.Failure
import System.IO

hSetEncoding stderr $ mkLocaleEncoding TransliterateCodingFailure

I'm trying to update the patch and rebase, but I'm having difficulty testing with GHC 8.2.
The latest master now targets GHC 8.2, but I failed to install GHC 8.2 on my Windows machine... 😞
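
For comparison, a portable variant of the same idea can be sketched with `mkTextEncoding` from `System.IO`, which accepts a `//TRANSLIT` suffix on the encoding name. This is an assumption-laden sketch rather than the patch itself: `makeLenient` is a hypothetical helper, and it relies on `show` of a `TextEncoding` yielding a name that `mkTextEncoding` accepts.

```haskell
import System.IO

-- | Hypothetical helper: re-create the handle's current encoding in
-- transliterating mode, so unencodable characters are replaced instead of
-- raising an error.
makeLenient :: Handle -> IO ()
makeLenient h = do
  menc <- hGetEncoding h
  case menc of
    Nothing  -> return ()  -- binary handle: nothing to change
    Just enc -> do
      -- Drop any existing "//..." suffix before adding "//TRANSLIT".
      let base = takeWhile (/= '/') (show enc)
      lenient <- mkTextEncoding (base ++ "//TRANSLIT")
      hSetEncoding h lenient

main :: IO ()
main = do
  makeLenient stderr
  -- Contains a bullet and curly quotes, like the GHC warning above.
  hPutStrLn stderr "\x2022 No explicit implementation for \x2018toEnum\x2019"
```

On a console whose encoding cannot represent these characters, GHC's transliteration mode substitutes `?` for them; on a UTF-8 console the text passes through unchanged.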

Contributor Author

Sorry, I was mistaken.
I failed to build cabal-install with GHC 8.2 on my Windows machine. GHC 8.2 itself installed successfully.

Member

@alexbiehl alexbiehl Jul 3, 2017

Nice @igrep!

cabal-install: Are you trying cabal-install HEAD from git? What error message do you get?

And for haddock: Unfortunately, ghc-8.2 added a new function that haddock relies upon but that isn't present in ghc-8.2-rc2. If you run into this, try reverting d5d8cd1. Then you should be able to compile.

Also you could make your change and push it here. Travis will indicate if it is good or not.

Contributor Author

@igrep igrep Jul 4, 2017

Okay, I got it. I should have just installed the standalone cabal-install from https://www.haskell.org/cabal/download.html ...
That may have fixed the problem! Thanks!

Member

Great! I am eager to review your change, as I am currently experiencing those utf8 errors myself.

Contributor Author

OMG! Another error when building haddock... a cabal bug?

$ cabal new-build
In order, the following will be built (use -v for more details):
 - ghc-paths-0.1.0.9 (lib:ghc-paths) (requires build)
 - haddock-api-2.18.0 (lib) (first run)
 - haddock-2.18.0 (exe:haddock) (first run)
Configuring ghc-paths-0.1.0.9 (all, legacy fallback)...
Building ghc-paths-0.1.0.9 (all, legacy fallback)...

Failed to build ghc-paths-0.1.0.9. The failure occurred during the final
install step.
Build log (
C:\Users\igrep\AppData\Roaming\cabal\logs\ghc-8.2.0.20170507\ghc-paths-0.1.0.9-2b1e83330a86c551a66824a766311413e1d9a984.log
):
Installing library in C:\Users\igrep\AppData\Roaming\cabal\store\ghc-8.2.0.20170507\incoming\new-14172\Users\igrep\AppData\Roaming\cabal\store\ghc-8.2.0.20170507\ghc-paths-0.1.0.9-2b1e83330a86c551a66824a766311413e1d9a984\lib
copyFile: does not exist (指定されたパスが見つかりません。)box\prg\foreign\haddock\dist-newstyle\tmp\src-14172\ghc-paths-0.1.0.9\dist\set
up\setup.exe ...
Configuring ghc-paths-0.1.0.9...
cabal.exe: Failed to build ghc-paths-0.1.0.9 (which is required by exe:haddock
from haddock-2.18.0). See the build log above for details.

exit status 1

Contributor Author

Looks like haskell/cabal#4515

Contributor Author

Fine! 🎉

$ haddock-master.exe Language/Haskell/HsColour/ANSI.hs

Language\Haskell\HsColour\ANSI.hs:62:10: warning: [-Wmissing-methods]
    ? No explicit implementation for
        ▒etoEnum▒f
    ? In the instance declaration for ▒eEnum Highlight▒f
   |
62 | instance Enum Highlight where
   |          ^^^^^^^^^^^^^^
Haddock's resource directory does not exist!

Haddock coverage:
  88% (  7 /  8) in 'Language.Haskell.HsColour.ColourHighlight'
  Missing documentation for:
    Module header
  33% (  1 /  3) in 'Language.Haskell.HsColour.Output'
  Missing documentation for:
    Module header
    TerminalType (.\Language\Haskell\HsColour\Output.hs:3)
  37% ( 10 / 27) in 'Language.Haskell.HsColour.ANSI'
  Missing documentation for:
    highlightOnG (Language/Haskell/HsColour/ANSI.hs:91)
    highlightOff (Language/Haskell/HsColour/ANSI.hs:96)
    cleareol (Language/Haskell/HsColour/ANSI.hs:46)
    clearbol (Language/Haskell/HsColour/ANSI.hs:47)
    clearline (Language/Haskell/HsColour/ANSI.hs:48)
    clearDown (Language/Haskell/HsColour/ANSI.hs:49)
    clearUp (Language/Haskell/HsColour/ANSI.hs:50)
    cursorUp (Language/Haskell/HsColour/ANSI.hs:41)
    cursorDown (Language/Haskell/HsColour/ANSI.hs:42)
    cursorLeft (Language/Haskell/HsColour/ANSI.hs:44)
    cursorRight (Language/Haskell/HsColour/ANSI.hs:43)
    savePosition (Language/Haskell/HsColour/ANSI.hs:55)
    restorePosition (Language/Haskell/HsColour/ANSI.hs:56)
    scrollUp (Language/Haskell/HsColour/ANSI.hs:118)
    scrollDown (Language/Haskell/HsColour/ANSI.hs:116)
    lineWrap (Language/Haskell/HsColour/ANSI.hs:122)
    TerminalType (.\Language\Haskell\HsColour\Output.hs:3)
Warning: Language.Haskell.HsColour.ColourHighlight: could not find link destinations for:
    Word8 Enum succ pred toEnum Int fromEnum enumFrom enumFromThen enumFromTo enumFromThenTo Eq == Bool /= Read readsPrec ReadS readList readPrec ReadPrec readListPrec Show showsPrec ShowS show String showList Integral
Warning: Language.Haskell.HsColour.Output: could not find link destinations for:
    Eq == Bool /= Ord compare Ordering < <= > >= max min Show showsPrec Int ShowS show String showList
Warning: Language.Haskell.HsColour.ANSI: could not find link destinations for:
    String Char Int Eq == Bool /= Read readsPrec ReadS readList readPrec ReadPrec readListPrec Show showsPrec ShowS show showList Word8 Enum succ pred toEnum fromEnum enumFrom enumFromThen enumFromTo enumFromThenTo Ord compare Ordering < <= > >= max min
$ haddock.exe Language/Haskell/HsColour/ANSI.hs

Language\Haskell\HsColour\ANSI.hs:62:10: warning: [-Wmissing-methods]
    Haddock coverage:
  88% (  7 /  8) in 'Language.Haskell.HsColour.ColourHighlight'
  Missing documentation for:
    Module header
  33% (  1 /  3) in 'Language.Haskell.HsColour.Output'
  Missing documentation for:
    Module header
    TerminalType (.\Language\Haskell\HsColour\Output.hs:3)
haddock: internal error: <stderr>: hPutChar: invalid argument (invalid character)

-rawSrc = readFile $ msHsFilePath ms
+rawSrc = readFileUtf8 $ msHsFilePath ms

+readFileUtf8 :: FilePath -> IO String
Member

This one on the other hand is a very good idea!
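
A minimal sketch of what a helper with this signature could look like; only the name and type come from the diff above, and the body is an assumption rather than the code that was merged. It forces the whole file before the handle is closed so that decoding errors surface inside the helper.

```haskell
import Control.Exception (evaluate)
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- | Read a file as UTF-8 regardless of the locale encoding.
-- Sketch only: the body is assumed, not taken from the patch.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h utf8
    contents <- hGetContents h
    -- Force the full read before 'withFile' closes the handle.
    _ <- evaluate (length contents)
    return contents

main :: IO ()
main = do
  tmpDir <- getTemporaryDirectory
  (path, h) <- openTempFile tmpDir "utf8-demo.hs"
  hSetEncoding h utf8
  -- U+21D2 is the double arrow that triggered error (2).
  hPutStr h "-- f :: Eq a \x21D2 a -> Bool\n"
  hClose h
  src <- readFileUtf8 path
  removeFile path
  print (length src)
```

Because the encoding is fixed on the file handle, this does not need any graceful degradation for the terminal: source files are decoded the same way on every platform.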

@alexbiehl
Member

alexbiehl commented Jul 4, 2017 via email

@igrep igrep force-pushed the windows-non-utf8 branch from d4009bd to 0333dcc Compare July 4, 2017 13:19
@@ -165,6 +172,9 @@ createIfaces verbosity flags instIfaceMap mods = do
 processModule :: Verbosity -> ModSummary -> [Flag] -> IfaceMap -> InstIfaceMap -> Ghc (Maybe Interface)
 processModule verbosity modsum flags modMap instIfaceMap = do
   out verbosity verbose $ "Checking module " ++ moduleString (ms_mod modsum) ++ "..."
+#if defined(mingw32_HOST_OS)
Member

Nice! But one last nit: wouldn't it make sense to set this up somewhere upstream in the call hierarchy?

Member

And: do we need the CPP here? Couldn't this potentially happen on non-Windows systems too?

Member

Ah, I see that mkLocaleEncoding only exists on Windows; that's OK then.

Contributor Author

How about readPackagesAndProcessModules? Or processModules ?

Member

Please put it in haddockWithGhc; it is the top-level entry point, so we can be sure nothing slips through.

Contributor Author

🤔 Hmmm... processModules looks better because it's part of the public API of Documentation.Haddock.

Member

Ok, do it!
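
The outcome of this thread, a Windows-only encoding setup performed once at an entry point, can be sketched roughly as below. The names `setupConsoleEncoding` and `entryPoint` are hypothetical stand-ins (the real entry point discussed is processModules), and the Windows branch is untested here; it only uses the calls shown in the workaround earlier in the thread.

```haskell
{-# LANGUAGE CPP #-}
import System.IO

#if defined(mingw32_HOST_OS)
-- These modules/functions are only available on Windows builds of base.
import GHC.IO.Encoding.CodePage (mkLocaleEncoding)
import GHC.IO.Encoding.Failure (CodingFailureMode (TransliterateCodingFailure))
#endif

-- | Hypothetical helper: make stderr lenient once, before any module is
-- processed. On non-Windows systems this is a no-op.
setupConsoleEncoding :: IO ()
setupConsoleEncoding = do
#if defined(mingw32_HOST_OS)
  hSetEncoding stderr (mkLocaleEncoding TransliterateCodingFailure)
#endif
  return ()

-- | Stand-in for an entry point such as processModules.
entryPoint :: [FilePath] -> IO ()
entryPoint files = do
  setupConsoleEncoding
  mapM_ (\f -> hPutStrLn stderr ("Checking module " ++ f ++ "...")) files

main :: IO ()
main = entryPoint ["Language/Haskell/HsColour/ANSI.hs"]
```

Doing the setup at the entry point rather than inside a per-module function means the handle is configured before the first warning is printed, which is the concern raised in the review.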

…character)' non UTF-8 Windows

Better solution for 5941175's (1)
@igrep igrep force-pushed the windows-non-utf8 branch from 0333dcc to 855118e Compare July 4, 2017 13:51
@alexbiehl
Member

Looks good to me! As soon as it is green I will merge and ping ben to make sure this is included in the release for ghc-8.2.

@alexbiehl alexbiehl merged commit 22cbf4d into haskell:master Jul 4, 2017
@igrep igrep deleted the windows-non-utf8 branch July 4, 2017 14:16
@alexbiehl
Member

Thanks again @igrep! ben just confirmed this will be part of ghc-8.2! So you can enjoy the fruits of your work soon!

@igrep
Contributor Author

igrep commented Jul 8, 2017

Thank you! I confirmed GHC 8.2.1 rc3's haddock contains my patch! 😄

igrep pushed a commit to igrep/haddock that referenced this pull request Jul 23, 2018
Steps to reproduce and the error message
====

```
> stack haddock basement
... snip ...
    Warning: 'A' is out of scope.
    Warning: 'haddock: internal error: <stdout>: commitBuffer: invalid argument (invalid character)
```

Environment
====

OS: Windows 10 ver. 1709
haddock: [HEAD of ghc-8.4 when I reproduced the error](haskell@532b209). (I had to use this version to avoid another problem already fixed in HEAD)
GHC: 8.4.3
stack: Version 1.7.1, Git revision 681c800873816c022739ca7ed14755e85a579565 (5807 commits) x86_64 hpack-0.28.2

Related pull request
====

haskell#566
alexbiehl pushed a commit that referenced this pull request Jul 23, 2018
…892)
alanz pushed a commit that referenced this pull request Mar 25, 2020
…892)
hubot pushed a commit to ghc/ghc that referenced this pull request May 17, 2024
…#892)