Attempting to create a Homa server results in a Segmentation Fault #15

Open
chwebb02 opened this issue Jun 11, 2024 · 15 comments

@chwebb02

chwebb02 commented Jun 11, 2024

Running test_server results in a segmentation fault during the call to BuildAndStart(). Below is the stack backtrace produced by GDB.

(gdb) bt
#0  0x00007ffff7da4d74 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::operator std::basic_string_view<char, std::char_traits<char> >() const ()
   from /lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x0000555555a4a193 in std::shared_ptr<grpc_core::AVL<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, grpc_core::ChannelArgs::Value>::Node> grpc_core::AVL<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, grpc_core::ChannelArgs::Value>::Get<std::basic_string_view<char, std::char_traits<char> > >(std::shared_ptr<grpc_core::AVL<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, grpc_core::ChannelArgs::Value>::Node> const&, std::basic_string_view<char, std::char_traits<char> > const&) ()
#2  0x0000555555a48ae5 in grpc_core::ChannelArgs::Value const* grpc_core::AVL<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, grpc_core::ChannelArgs::Value>::Lookup<std::basic_string_view<char, std::char_traits<char> > >(std::basic_string_view<char, std::char_traits<char> > const&) const ()
#3  0x0000555555a4385b in grpc_core::ChannelArgs::Get(std::basic_string_view<char, std::char_traits<char> >) const ()
#4  0x0000555555a44d19 in grpc_core::ChannelArgs::GetString(std::basic_string_view<char, std::char_traits<char> >) const ()
#5  0x0000555555b3ce19 in grpc_core::Channel::Create(char const*, grpc_core::ChannelArgs, grpc_channel_stack_type, grpc_transport*) ()
#6  0x0000555555b5020e in grpc_core::Server::SetupTransport(grpc_transport*, grpc_pollset*, grpc_core::ChannelArgs const&, grpc_core::RefCountedPtr<grpc_core::channelz::SocketNode> const&)
    ()
#7  0x0000555555624814 in HomaListener::Transport::start (this=0x555556aded90, server=0x555556aea8f0, pollsets=0x555556aea940) at ../grpc/src/core/lib/surface/server.h:130
#8  0x0000555555b500f4 in grpc_core::Server::Start() ()
#9  0x0000555555b55ae0 in grpc_server_start ()
#10 0x00005555558dec1c in grpc::Server::Start(grpc::ServerCompletionQueue**, unsigned long) ()
#11 0x00005555558cede0 in grpc::ServerBuilder::BuildAndStart() ()
#12 0x00005555555ff0a8 in main (argc=<optimized out>, argv=<optimized out>) at test_server.cc:127
@johnousterhout
Member

Sorry for my slow response. What version of gRPC are you working with?

@chwebb02
Author

No worries, thank you for helping! I am using gRPC 1.57.0 with the most recent version of the Homa Kernel Module running on top of Ubuntu 22.04 LTS with Linux kernel version 6.1.38.

@johnousterhout
Member

johnousterhout commented Jun 20, 2024 via email

@chwebb02
Author

After setting DEBUG to yes in the Makefile, there was no segmentation fault. However, the problem persists when compiling with DEBUG set to no.

@johnousterhout
Member

johnousterhout commented Jun 21, 2024 via email

@chwebb02
Author

chwebb02 commented Jun 27, 2024

Sorry for the delay. Here is the stack trace from gdb:


#0  0x00007ffff7da4d74 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::operator std::basic_string_view<char, std::char_traits<char> >() const ()
   from /lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x0000555555a4a193 in std::shared_ptr<grpc_core::AVL<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, grpc_core::ChannelArgs::Value>::Node> grpc_core::AVL<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, grpc_core::ChannelArgs::Value>::Get<std::basic_string_view<char, std::char_traits<char> > >(std::shared_ptr<grpc_core::AVL<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, grpc_core::ChannelArgs::Value>::Node> const&, std::basic_string_view<char, std::char_traits<char> > const&) ()
#2  0x0000555555a48ae5 in grpc_core::ChannelArgs::Value const* grpc_core::AVL<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, grpc_core::ChannelArgs::Value>::Lookup<std::basic_string_view<char, std::char_traits<char> > >(std::basic_string_view<char, std::char_traits<char> > const&) const ()
#3  0x0000555555a4385b in grpc_core::ChannelArgs::Get(std::basic_string_view<char, std::char_traits<char> >) const ()
#4  0x0000555555a44d19 in grpc_core::ChannelArgs::GetString(std::basic_string_view<char, std::char_traits<char> >) const ()
#5  0x0000555555b3ce19 in grpc_core::Channel::Create(char const*, grpc_core::ChannelArgs, grpc_channel_stack_type, grpc_transport*) ()
#6  0x0000555555b5020e in grpc_core::Server::SetupTransport(grpc_transport*, grpc_pollset*, grpc_core::ChannelArgs const&, grpc_core::RefCountedPtr<grpc_core::channelz::SocketNode> const&)
    ()
#7  0x0000555555624814 in HomaListener::Transport::start(grpc_core::Server*, std::vector<grpc_pollset*, std::allocator<grpc_pollset*> > const*) ()
#8  0x0000555555b500f4 in grpc_core::Server::Start() ()
#9  0x0000555555b55ae0 in grpc_server_start ()
#10 0x00005555558dec1c in grpc::Server::Start(grpc::ServerCompletionQueue**, unsigned long) ()
#11 0x00005555558cede0 in grpc::ServerBuilder::BuildAndStart() ()
#12 0x00005555555ff0a8 in main ()

@johnousterhout
Member

Thanks for the stack trace (I recently noticed that you already sent this earlier... sorry for making you send it again). It appears that the channel arguments object is somehow getting corrupted. Can you try the following steps?

  • Replace your version of homa_listener.cc with the file homa_listener.txt that I have attached.
  • Compile and run test_server; before it crashes, it should print out the names of all the channel arguments.
  • Find the (new) PrintChannelArgs method in homa_listener.cc and enter all of the argument names into the names.push_back statements, adjusting the number of statements to reflect the number of arguments (believe it or not, there is no way to actually query the names of the channel arguments at runtime).
  • Now run the program again, both compiled for debugging and compiled without debugging, and respond back with the results in each case.

Hopefully this will help to narrow down the problem a bit.
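
The approach in the steps above, looking up a hand-maintained list of argument names and reporting each value's type, can be sketched as follows. This is only an illustration under stated assumptions: `FakeChannelArgs`, `FakePointer`, and `DescribeArg` are hypothetical stand-ins and not the gRPC `ChannelArgs` API, and the two names pushed onto the list are examples.

```cpp
#include <cassert>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <variant>
#include <vector>

// Hypothetical wrapper for an opaque pointer-valued argument.
struct FakePointer {
    const void* ptr;
};

// Hypothetical stand-in for a channel-args container: each name maps to an
// integer, a string, or an opaque pointer. This is NOT grpc_core::ChannelArgs.
using FakeArgValue = std::variant<int, std::string, FakePointer>;
using FakeChannelArgs = std::map<std::string, FakeArgValue>;

// Describes one argument by name; a missing name is reported, not dereferenced.
std::string DescribeArg(const FakeChannelArgs& args, const std::string& name) {
    auto it = args.find(name);
    if (it == args.end()) {
        return name + " is not set";
    }
    std::ostringstream out;
    if (const int* i = std::get_if<int>(&it->second)) {
        out << name << " is an integer: " << *i;
    } else if (const std::string* s = std::get_if<std::string>(&it->second)) {
        out << name << " is a string: " << *s;
    } else {
        out << name << " is a pointer";
    }
    return out.str();
}

// Walks a hand-written list of names, one push_back per argument, because the
// container is assumed to offer no way to enumerate names for the caller.
void PrintChannelArgs(const FakeChannelArgs& args, std::ostream& os) {
    std::vector<std::string> names;
    names.push_back("grpc.primary_user_agent");  // example name
    names.push_back("grpc.resource_quota");      // example name
    for (const std::string& name : names) {
        os << "Printing channel arg " << name << "\n";
        os << DescribeArg(args, name) << "\n";
    }
}
```

Passing `std::cout` as the stream prints the report to the terminal; taking the stream as a parameter simply makes the sketch easy to check in isolation.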

homa_listener.txt

@chwebb02
Author

chwebb02 commented Jul 16, 2024

Sorry for my late response. I am not able to compile with the new homa_listener. I get the following output:

homa_listener.cc: In member function ‘void HomaListener::Transport::start(grpc_core::Server*, const std::vector<grpc_pollset*>*)’:
homa_listener.cc:297:28: error: ‘const class grpc_core::ChannelArgs’ has no member named ‘printNames’
  297 |     server->channel_args().printNames();
      |                            ^~~~~~~~~~

@johnousterhout
Member

johnousterhout commented Jul 16, 2024 via email

@chwebb02
Author

I was not able to find the method when searching through the file. I confirmed that I am using gRPC 1.57.0; running git log shows this commit:

commit a61640173d00b63e0b55ad61915a9b1708e12d27 (grafted, HEAD, tag: v1.57.0)
Author: AJ Heller <[email protected]>
Date:   Tue Aug 8 10:56:15 2023 -0700

    [Release] Bump version to 1.57.0 (on v1.57.x branch) (#34008)
    
    Change was created by the release automation script. See go/grpc-release

@johnousterhout
Member

Oops, sorry, my goof. Looking through my sources, I see that I added that method myself for debugging a while ago and then forgot about it. I'm attaching my version of src/core/lib/channel/channel_args.cc (I had to attach it as a .txt file instead of .cc so that GitHub would accept it); can you drop that into your grpc tree, build with it, and run the test?

channel_args.txt

@chwebb02
Author

I also had to make a change to the appropriate header file to add a declaration of printNames(). After doing so and running test_server I am not receiving any output before the segmentation fault. I am not sure if this is a mistake on my end in configuring it or something else.
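
For reference, here is a minimal sketch of the kind of header change this involves. The earlier compiler error shows the call being made on a `const class grpc_core::ChannelArgs`, so the declaration would need to be `const`-qualified. `MiniArgs` is a hypothetical miniature class, not the gRPC definition, and the flush after each line is only an assumption about one way to keep buffered output from being lost if the process crashes right afterwards.

```cpp
#include <cassert>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical miniature of an argument container; not grpc_core::ChannelArgs.
class MiniArgs {
 public:
    void Set(const std::string& name, int value) { args_[name] = value; }

    // Header side: declared const so it can be called through a const
    // reference to the container.
    void printNames(std::ostream& os) const;

 private:
    std::map<std::string, int> args_;
};

// Source side: the out-of-line definition must repeat the const qualifier,
// otherwise it does not match the declaration above.
void MiniArgs::printNames(std::ostream& os) const {
    for (const auto& entry : args_) {
        os << "ChannelArgs name: " << entry.first << "\n";
        os.flush();  // push each line out immediately
    }
}
```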

@chwebb02
Author

After another attempt, the code does not segfault when using the provided channel_args. The output of running test_server is as follows:

chwebb02@vm0:~/grpc_homa$ ./test_server 
ChannelArgs name: grpc.compression_enabled_algorithms_bitset
ChannelArgs name: grpc.internal.event_engine
ChannelArgs name: grpc.primary_user_agent
ChannelArgs name: grpc.resource_quota
Printing channel arg grpc.compression_enabled_algorithms_bitset
grpc.compression_enabled_algorithms_bitset is an integer: 7
Printing channel arg grpc.internal.event_engine
grpc.internal.event_engine is a pointer
Printing channel arg grpc.primary_user_agent
grpc.primary_user_agent is a string: grpc-c++/1.57.0
Printing channel arg grpc.resource_quota
grpc.resource_quota is a pointer
Server listening on port 4000

@johnousterhout
Member

Just double-checking to make sure I understand: the run above (which did not segfault) occurred even when running without debugging? If not, can you run it without debugging?

@chwebb02
Author

Yes, the above output is the result from using the provided files and compiling and running without debugging.
