Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inputready fd not set high following NCKEY_EOF #2525

Closed
dankamongmen opened this issue Jan 4, 2022 · 12 comments · Fixed by #2528
Closed

inputready fd not set high following NCKEY_EOF #2525

dankamongmen opened this issue Jan 4, 2022 · 12 comments · Fixed by #2528
Assignees
Labels
bug Something isn't working input readin' dem bytes
Milestone

Comments

@dankamongmen
Copy link
Owner

I can constantly read NCKEY_EOF, but the inputready fd is never high. If I add a check on input readiness, I never see the NCKEY_EOF. Test with cli1. See also #2521.

@dankamongmen dankamongmen added bug Something isn't working input readin' dem bytes labels Jan 4, 2022
@dankamongmen dankamongmen added this to the 3.1.0 milestone Jan 4, 2022
@dankamongmen dankamongmen self-assigned this Jan 4, 2022
@dankamongmen
Copy link
Owner Author

so the weird thing is....with the poll(), not only do i block, but i never get the NCKEY_EOF:

GOT: 0x001103f1
GOT: 0x0000006c
GOT: 0x0000006c
GOT: 0x0000006c
GOT: 0x0000006c
GOT: 0x001103f1
GOT: 0x001103f1
GOT: 0x00110424
GOT: 0x00000064
GOT: 0x00110424
GOT: 0x00000064
GOT: 0x00000064
GOT: 0x00000064
GOT: 0x00000071

so we're passing through the poll() any number of times -- NCKEY_EOF simply isn't being generated.

without the poll, we immediately get it, and exit:

GOT: 0x0011056c
2 renders, 440.48µs (147.32µs min, 220.24µs avg, 293.16µs max)
2 rasters, 111.63µs (54.60µs min, 55.81µs avg, 57.03µs max)
2 writes, 70.70µs (26.53µs min, 35.35µs avg, 44.17µs max)
55B (0B min, 27B avg, 55B max) 0 inputs Ghpa: 1
0 failed renders, 0 failed rasters, 0 refreshes, 0 input errors
RGB emits:elides: def 1:24 fg 0:0 bg 0:0
Cell emits:elides: 25:23735 (99.89%) 96.00% 0.00% 0.00%
Bitmap emits:elides: 0:0 (0.00%) 0B (0.00%) SuM: 0 (0.00%)

weird! but surely also the key to the problem.

@dankamongmen
Copy link
Owner Author

dankamongmen commented Jan 4, 2022

we only generate NCKEY_EOF when we come through internal_get() and hit:

pthread_mutex_lock(&ictx->ilock);                                                                                                 
  while(!ictx->ivalid){                                                                                                             
    if(ictx->stdineof){                                                                                                             
      pthread_mutex_unlock(&ictx->ilock);                                                                                           
      logwarn("read eof on stdin");                                                                                                 
      if(ni){                                                                                                                       
        memset(ni, 0, sizeof(*ni));                                                                                                 
        ni->id = NCKEY_EOF;                                                                                                         
      }                                                                                                                             
      return NCKEY_EOF;                                                                                                             
    }                

so if we always hit this and have data available -- which we do, if we're checking poll() beforehand -- we never generate NCKEY_EOF.

the solution here is tripartite:
(1) resolve the main thrust of this bug by ensuring the inputready fd is readable once we hit NCKEY_EOF
(2) once we hit NCKEY_EOF, do not queue further input coming from the terminal
(3) always generate a NCKEY_EOF synthesized event upon hitting eof

@dankamongmen
Copy link
Owner Author

so yeah it doesn't look like we "generate an NCKEY_EOF" like we do other ncinputs, pushing it into the queue; it's an immediate result returned outside that paradigm. we see it down in read_input_nblock(), called from read_inputs_nblock(), i think process_input() needs to detect the level change, and when it happens, feed a queued event through.

@dankamongmen
Copy link
Owner Author

we're seeing a livelock in our drone CI when doing notcurses-input < notfcurses-input, and a deadlock when doing notcurses-input < /dev/null. this matches what we're seeing in Debian CI.

i got perf output from a livelock:

  44.34%  notcurses-input  libpthread-2.33.so          [.] __pthread_mutex_unlock_usercnt42.98%  notcurses-input  libpthread-2.33.so          [.] __pthread_mutex_lock6.78%  notcurses-input  libnotcurses-core.so.3.0.3  [.] process_ncinput3.28%  notcurses-input  libnotcurses-core.so.3.0.3  [.] process_melange1.21%  notcurses-input  libnotcurses-core.so.3.0.3  [.] pthread_mutex_lock@plt1.05%  notcurses-input  libnotcurses-core.so.3.0.3  [.] pthread_mutex_unlock@plt 

@dankamongmen
Copy link
Owner Author

if we're spinning on pthreads....hrmmm, are we certain that's a livelock, and not just a long-lived process? eh, it's running twenty minutes, so presumably...

dankamongmen added a commit that referenced this issue Jan 5, 2022
When we get to the EOF on stdin, we need account for that
with the readiness fd. Post to it upon the EOF transition.
Closes #2525.
dankamongmen added a commit that referenced this issue Jan 5, 2022
When we get to the EOF on stdin, we need account for that
with the readiness fd. Post to it upon the EOF transition.
Closes #2525.
dankamongmen added a commit that referenced this issue Jan 5, 2022
When we get to the EOF on stdin, we need account for that
with the readiness fd. Post to it upon the EOF transition.
Closes #2525.
@dankamongmen
Copy link
Owner Author

ok, so there's one more issue, and this is presumably why our CI is failing. we can get the EOF before we get the flurry of terminal responses, as we probably do when redirecting from /dev/null, see here:

link_numeric:380:adding numeric with follow ; following ]4;                                                                         
link_numeric:393:phi node: 427->428                                                                                                 
link_numeric:394:eta node: 429 philink[;]: 429                                                                                      
insert_path:481:added fixed r 114 as 430                                                                                            
insert_path:481:added fixed g 103 as 431                                                                                            
insert_path:481:added fixed b 98 as 432                                                                                             
insert_path:481:added fixed : 58 as 433                                                                                             
esctrie_make_string:181:made string: 435                          <------ done building automaton                                                                  
block_on_input:2176:blocking on input availability                                                                                  
block_on_input:2233:waiting on 2 fds (ibuf: 0/8192)                                                                                 
block_on_input:2260:poll returned 2                                                                                                 
block_on_input:2274:got events: TI                                                                                                  
read_input_nblock:1871:read 4095B from 8 (4097B left)    <------------ read a bunch of terminal reply                                                                           
read_input_nblock:1862:got EOF on 0                                <------------ read eof on stdin                                                                 
process_escape:1908:initialized automaton to 1                 <------------ process terminal reply                                                                     
process_escape:1920:walk result on 91 ([): 0 4                                                                                      
process_escape:1920:walk result on 55 (7): 0 269                                                                                    
process_escape:1920:walk result on 52 (4): 0 269                                                                                    
process_escape:1920:walk result on 59 (;): 0 270                                                                                    
process_escape:1920:walk result on 49 (1): 0 271                                                                                    
process_escape:1920:walk result on 82 (R): 2 315             

and here we never really process the EOF.

@dankamongmen
Copy link
Owner Author

hrmm, no, i just hadn't brought my local checkout up to speed. once i did, this problem goes away. we're still breaking in CI, though =.

@dankamongmen
Copy link
Owner Author

so in the drone context where we're seeing this lockup, here's our open fds:

root@2d7b55c0ae13:/proc/3070/cwd# ls -l /proc/1/fd
total 0
lrwx------ 1 root root 64 Jan  5 21:42 0 -> /dev/null
l-wx------ 1 root root 64 Jan  5 21:42 1 -> 'pipe:[5645757]'
l-wx------ 1 root root 64 Jan  5 21:42 2 -> 'pipe:[5645758]'
root@2d7b55c0ae13:/proc/3070/cwd# 

@dankamongmen
Copy link
Owner Author

so different pipes for stdout and stderr, /dev/null for stdin

@dankamongmen
Copy link
Owner Author

environment:

root@2d7b55c0ae13:/proc/3070/cwd# cat /proc/1/environ 
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/binHOSTNAME=2d7b55c0ae13DRONE_SYSTEM_VERSION=2.7.0DRONE_BUILD_TRIGGER=@hookDRONE_SCRIPT=

if [ ! -z "${DRONE_NETRC_FILE}" ]; then
	echo $DRONE_NETRC_FILE > $HOME/.netrc
	chmod 600 $HOME/.netrc
fi

unset DRONE_SCRIPT
unset DRONE_NETRC_MACHINE
unset DRONE_NETRC_USERNAME
unset DRONE_NETRC_PASSWORD
unset DRONE_NETRC_FILE

set -e

remote_debug() {
	if [ "$?" -ne "0" ]; then
		/usr/drone/bin/tmate -F
	fi
}

if [ "${DRONE_BUILD_DEBUG}" = "true" ]; then
	if [ ! -z "${DRONE_TMATE_HOST}" ]; then
		echo "set -g tmate-server-host $DRONE_TMATE_HOST" >> $HOME/.tmate.conf
		echo "set -g tmate-server-port $DRONE_TMATE_PORT" >> $HOME/.tmate.conf
		echo "set -g tmate-server-rsa-fingerprint $DRONE_TMATE_FINGERPRINT_RSA" >> $HOME/.tmate.conf
		echo "set -g tmate-server-ed25519-fingerprint $DRONE_TMATE_FINGERPRINT_ED25519" >> $HOME/.tmate.conf

		if [ ! -z "${DRONE_TMATE_AUTHORIZED_KEYS}" ]; then
			echo "$DRONE_TMATE_AUTHORIZED_KEYS" > $HOME/.tmate.authorized_keys
			echo "set -g tmate-authorized-keys \"$HOME/.tmate.authorized_keys\"" >> $HOME/.tmate.conf
		fi
	fi
	trap remote_debug EXIT
fi


echo + "export LANG=en_US.UTF-8"
export LANG=en_US.UTF-8

echo + "export TERM=xterm"
export TERM=xterm

echo + "mkdir build"
mkdir build

echo + "cd build"
cd build

echo + "cmake .. -DCMAKE_BUILD_TYPE=Release -DUSE_STATIC=off -DUSE_GPM=on -DUSE_QRCODEGEN=on -DDFSG_BUILD=on"
cmake .. -DCMAKE_BUILD_TYPE=Release -DUSE_STATIC=off -DUSE_GPM=on -DUSE_QRCODEGEN=on -DDFSG_BUILD=on

echo + "make -j2"
make -j2

echo + "./notcurses-input < /dev/null"
./notcurses-input < /dev/null

echo + "env NOTCURSES_LOGLEVEL=7 ./notcurses-input < notcurses-input"
env NOTCURSES_LOGLEVEL=7 ./notcurses-input < notcurses-input

echo + "./notcurses-input < notcurses-input"
./notcurses-input < notcurses-input
DRONE_SOURCE_BRANCH=dankamongmen/fucking-inputDRONE_COMMIT_AUTHOR_AVATAR=https://avatars.githubusercontent.com/u/143473?v=4DRONE_STAGE_OS=linuxCI_COMMIT_REF=refs/heads/dankamongmen/fucking-inputCI_PARENT_BUILD_NUMBER=0DRONE_REMOTE_URL=https://github.com/dankamongmen/notcurses.gitDRONE_BUILD_NUMBER=10248DRONE_REPO_NAMESPACE=dankamongmenCI_REPO_LINK=https://github.com/dankamongmen/notcursesDRONE_REPO_PRIVATE=falseCI_COMMIT_SHA=53d95ac10aa983bb379c7f55e493bd18ce8ab5fcCI_COMMIT_AUTHOR_NAME=nick blackDRONE_GIT_HTTP_URL=https://github.com/dankamongmen/notcurses.gitCI_REPO_NAME=dankamongmen/[email protected]_COMMIT_MESSAGE=[drone] try and get some useful logsDRONE_BRANCH=dankamongmen/fucking-inputCI_COMMIT_AUTHOR_AVATAR=https://avatars.githubusercontent.com/u/143473?v=4DRONE_REPO_LINK=https://github.com/dankamongmen/notcursesDRONE_BUILD_STATUS=successDRONE_STAGE_TYPE=dockerGIT_AUTHOR_NAME=nick blackDRONE_REPO_OWNER=dankamongmenCI=trueDRONE_BUILD_CREATED=1641418920DRONE_COMMIT_MESSAGE=[drone] try and get some useful logsDRONE_COMMIT=53d95ac10aa983bb379c7f55e493bd18ce8ab5fcDRONE_WORKSPACE_BASE=/drone/srcDRONE_COMMIT_AUTHOR_NAME=nick blackDRONE_COMMIT_REF=refs/heads/dankamongmen/fucking-inputDRONE_REPO_BRANCH=masterDRONE_DOCKER_VOLUME_ID=drone-umi3O9KRd3GBRptjD3QiDRONE_BUILD_PARENT=0DRONE_STAGE_NAME=debian-unstableDRONE_COMMIT_AFTER=53d95ac10aa983bb379c7f55e493bd18ce8ab5fcDRONE_BUILD_EVENT=pushCI_COMMIT_AUTHOR=dankamongmenDRONE_STAGE_STARTED=1641418921CI_BUILD_CREATED=1641418920DRONE_BUILD_LINK=https://drone.dsscaw.com:4443/dankamongmen/notcurses/10248DRONE_WORKSPACE=/drone/srcCI_BUILD_FINISHED=0CI_REPO_REMOTE=https://github.com/dankamongmen/[email protected]_BUILD_LINK=https://github.com/dankamongmen/notcurses/compare/25c54f875331...53d95ac10aa9DRONE_SYSTEM_PROTO=httpsDRONE_BUILD_FINISHED=1641418940DRONE_BUILD_DEBUG=falseDRONE_COMMIT_BEFORE=25c54f875331ac09a1b3c57239ec810c8bb2019cDRONE_DOCKER_NETWORK_ID=drone-hulmydnAlRkDgQ6DDagKCI_BUILD_NUMBER=10248DRONE_STAGE_FINISHED=1641418940DRONE_GIT_SSH_URL=git@github.com:dankamongmen/notcurses.gitDRONE_STAGE_STATUS=successDRONE_STAGE_ARCH=amd64CI_REPO_PRIVATE=falseDRONE_TARGET_BRANCH=dankamongmen/fucking-inputDRONE_STEP_NAME=debian-buildGIT_COMMITTER_NAME=nick blackCI_BUILD_EVENT=pushDRONE_STAGE_KIND=pipelineDRONE_SYSTEM_HOSTNAME=drone.dsscaw.com:4443DRONE_COMMIT_SHA=53d95ac10aa983bb379c7f55e493bd18ce8ab5fcGIT_COMMITTER_EMAIL=dankamongmen@gmail.comDRONE_STAGE_VERSION=3GIT_TERMINAL_PROMPT=0DRONE_STAGE_MACHINE=vps1DRONE_STAGE_NUMBER=1DRONE_CARD_PATH=/dev/stdoutDRONE=trueDRONE_STEP_NUMBER=2DRONE_REPO_NAME=notcursesDRONE_DEPLOY_ID=0DRONE_BUILD_STARTED=1641418940CI_COMMIT_BRANCH=dankamongmen/fucking-inputDRONE_COMMIT_BRANCH=dankamongmen/fucking-inputDRONE_COMMIT_LINK=https://github.com/dankamongmen/notcurses/compare/25c54f875331...53d95ac10aa9DRONE_REPO_VISIBILITY=publicCI_REMOTE_URL=https://github.com/dankamongmen/notcurses.gitDRONE_COMMIT_AUTHOR=dankamongmenDRONE_SYSTEM_HOST=drone.dsscaw.com:4443CI_BUILD_STATUS=pendingCI_REPO=dankamongmen/notcursesCI_BUILD_STARTED=0DRONE_COMMIT_AUTHOR_EMAIL=dankamongmen@gmail.comDRONE_REPO=dankamongmen/notcursesHOME=/rootroot@2d7b55c0ae13:/proc/3070/cwd# 

@dankamongmen
Copy link
Owner Author

environment of actual notcurses-input:

./notcurses-inputroot@2d7b55c0ae13:/proc/3070/cwd# cat /proc/3070/environ 
DRONE_COMMIT_AUTHOR_AVATAR=https://avatars.githubusercontent.com/u/143473?v=4DRONE_BRANCH=dankamongmen/fucking-inputGIT_COMMITTER_NAME=nick blackDRONE_SYSTEM_HOST=drone.dsscaw.com:[email protected]:dankamongmen/[email protected]_BUILD_DEBUG=falseDRONE_STAGE_VERSION=3CI=trueHOSTNAME=2d7b55c0ae13DRONE_STAGE_OS=linuxDRONE_REPO_NAMESPACE=dankamongmenDRONE_REPO_LINK=https://github.com/dankamongmen/notcursesCI_BUILD_NUMBER=10248DRONE_TARGET_BRANCH=dankamongmen/fucking-inputDRONE_COMMIT_AUTHOR=dankamongmenCI_BUILD_STARTED=0DRONE_GIT_HTTP_URL=https://github.com/dankamongmen/notcurses.gitCI_BUILD_LINK=https://github.com/dankamongmen/notcurses/compare/25c54f875331...53d95ac10aa9DRONE_COMMIT_BRANCH=dankamongmen/fucking-inputHOME=/rootOLDPWD=/drone/srcDRONE_REPO_PRIVATE=falseDRONE_STAGE_TYPE=dockerDRONE_SYSTEM_PROTO=httpsGIT_AUTHOR_NAME=nick blackDRONE_STEP_NUMBER=2DRONE_BUILD_STATUS=successDRONE_REPO_VISIBILITY=publicCI_PARENT_BUILD_NUMBER=0DRONE_WORKSPACE_BASE=/drone/srcDRONE_COMMIT_BEFORE=25c54f875331ac09a1b3c57239ec810c8bb2019cDRONE_STAGE_ARCH=amd64CI_BUILD_EVENT=pushCI_COMMIT_SHA=53d95ac10aa983bb379c7f55e493bd18ce8ab5fcCI_REPO_NAME=dankamongmen/notcursesCI_BUILD_CREATED=1641418920CI_COMMIT_AUTHOR_EMAIL=dankamongmen@gmail.comCI_COMMIT_REF=refs/heads/dankamongmen/fucking-inputDRONE_SOURCE_BRANCH=dankamongmen/fucking-inputDRONE_STAGE_NAME=debian-unstableDRONE_STAGE_STATUS=successDRONE_CARD_PATH=/dev/stdoutDRONE_COMMIT_MESSAGE=[drone] try and get some useful logsDRONE_REPO_BRANCH=masterCI_BUILD_FINISHED=0DRONE_SYSTEM_HOSTNAME=drone.dsscaw.com:4443DRONE_REMOTE_URL=https://github.com/dankamongmen/notcurses.gitCI_REPO=dankamongmen/notcursesTERM=xtermDRONE_BUILD_TRIGGER=@hookCI_COMMIT_AUTHOR_NAME=nick blackDRONE_REPO_OWNER=dankamongmenDRONE_STAGE_KIND=pipelinePATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/binCI_COMMIT_AUTHOR_AVATAR=https://avatars.githubusercontent.com/u/143473?v=4DRONE=trueDRONE_BUILD_NUMBER=10248DRONE_BUILD_STARTED=1641418940DRONE_BUILD_PARENT=0DRONE_BUILD_LINK=https://drone.dsscaw.com:4443/dankamongmen/notcurses/10248DRONE_WORKSPACE=/drone/srcDRONE_COMMIT_AFTER=53d95ac10aa983bb379c7f55e493bd18ce8ab5fcLANG=en_US.UTF-8CI_REPO_LINK=https://github.com/dankamongmen/notcursesCI_COMMIT_AUTHOR=dankamongmenDRONE_STAGE_STARTED=1641418921DRONE_STAGE_NUMBER=1DRONE_COMMIT=53d95ac10aa983bb379c7f55e493bd18ce8ab5fcCI_REPO_REMOTE=https://github.com/dankamongmen/notcurses.gitGIT_TERMINAL_PROMPT=0CI_COMMIT_BRANCH=dankamongmen/fucking-inputCI_REPO_PRIVATE=falseDRONE_BUILD_EVENT=pushDRONE_BUILD_CREATED=1641418920DRONE_COMMIT_SHA=53d95ac10aa983bb379c7f55e493bd18ce8ab5fcDRONE_REPO_NAME=notcursesDRONE_COMMIT_AUTHOR_EMAIL=dankamongmen@gmail.comDRONE_COMMIT_REF=refs/heads/dankamongmen/fucking-inputDRONE_STAGE_MACHINE=vps1CI_BUILD_STATUS=pendingGIT_COMMITTER_EMAIL=dankamongmen@gmail.comDRONE_DOCKER_VOLUME_ID=drone-umi3O9KRd3GBRptjD3QiDRONE_DEPLOY_ID=0PWD=/drone/src/buildDRONE_BUILD_FINISHED=1641418940DRONE_COMMIT_LINK=https://github.com/dankamongmen/notcurses/compare/25c54f875331...53d95ac10aa9CI_COMMIT_MESSAGE=[drone] try and get some useful logsDRONE_STEP_NAME=debian-buildDRONE_REPO=dankamongmen/notcursesDRONE_SYSTEM_VERSION=2.7.0DRONE_COMMIT_AUTHOR_NAME=nick blackDRONE_DOCKER_NETWORK_ID=drone-hulmydnAlRkDgQ6DDagKDRONE_STAGE_FINISHED=1641418940CI_REMOTE_URL=https://github.com/dankamongmen/notcurses.gitroot@2d7b55c0ae13:/proc/3070/cwd# 

and now we just have more pipes:

root@2d7b55c0ae13:/proc/3070/cwd# ls -l /proc/3070/fd
total 0
lr-x------ 1 root root 64 Jan  5 22:25 0 -> /dev/null
l-wx------ 1 root root 64 Jan  5 22:25 1 -> 'pipe:[5645757]'
l-wx------ 1 root root 64 Jan  5 22:25 2 -> 'pipe:[5645758]'
lr-x------ 1 root root 64 Jan  5 22:25 3 -> 'pipe:[5655568]'
l-wx------ 1 root root 64 Jan  5 22:25 4 -> 'pipe:[5655568]'
lr-x------ 1 root root 64 Jan  5 22:25 5 -> 'pipe:[5655569]'
l-wx------ 1 root root 64 Jan  5 22:25 6 -> 'pipe:[5655569]'
root@2d7b55c0ae13:/proc/3070/cwd# 

dankamongmen added a commit that referenced this issue Jan 6, 2022
When we get to the EOF on stdin, we need account for that
with the readiness fd. Post to it upon the EOF transition.
Closes #2525.
@dankamongmen
Copy link
Owner Author

+ ./notcurses-input < /dev/null
530	(Bnotcurses 3.0.3 on xterm (Linux 5.11.13-x86_64-linode143)
531	20 rows 76 cols 8 colors
532	gcc-11.2.0 (LE)
533	terminfo 6.2.20210905 libdeflate 1.8 GPM 2.1.0
534	avformat 58.76.100 avutil 56.70.100 swscale 5.9.100 avcodec 58.134.100
535	mouse_setup:21:no tty, not emitting mouse control
536	
537	IVALID: 0 EOF: 0 IREAD: 0
538	WHILE IVALID: 0 EOF: 0 IREAD: 0
539	WAITING

so we're sitting on the icond condition variable. hrmm.

dankamongmen added a commit that referenced this issue Jan 8, 2022
When we get to the EOF on stdin, we need account for that
with the readiness fd. Post to it upon the EOF transition.
Closes #2525.
dankamongmen added a commit that referenced this issue Jan 8, 2022
When we get to the EOF on stdin, we need account for that
with the readiness fd. Post to it upon the EOF transition.
Closes #2525.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working input readin' dem bytes
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant