<!DOCTYPE html>
<html lang="en" dir="ltr">
    <head>
        <meta charset="utf-8">
        <title>Taishi Nakamura portfolio</title>
        <link href="css/style.css" rel="stylesheet" type="text/css" media="all">
        <!-- "script" is not a valid link type, so js/script.js was never loaded; load it as a script instead -->
        <script src="js/script.js" defer></script>
        <link rel="icon" type="image/x-icon" href="img/favicon.ico">
        <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">
    </head>
    <body>
        <nav>
            <ul>
              <li><a href="#about">About</a></li>
              <li><a href="#education">Education</a></li>
              <li><a href="#research">Research experience</a></li>
              <li><a href="#industrial">Industrial experience</a></li>
              <li><a href="#publications">Publications</a></li>
              <li><a href="#contact">Contact</a></li>
            </ul>
          </nav>
        <div class="centering-block">
            <div class="centering-block-inner">
                <div class="profile-header">
                    <img src="./img/profile.jpg" alt="Taishi Nakamura" class="profile-image">
                    <div class="profile-info">
                        <h1 id="about">Taishi Nakamura</h1>
                        <div class="text-container">
                            <p>
                                I love training large-scale models and am passionate about scalable neural network architectures. I am fascinated by the computing power that makes such training possible, particularly distributed computing systems. My interests extend to building multimodal systems, developing advanced reasoning capabilities, and creating agents that continually evolve.
                            </p>     
                        </div>
                    </div>
                </div>

                <h2 id="education"> Education </h2>

                <h3>Oct 2024 - Present &nbsp; Institute of Science Tokyo</h3>
                    <div class="logo-text">
                        <img src="./img/science-tokyo.jpg" alt="Institute of Science Tokyo logo">
                        <div class="text-container">
                            <p>Master of Science in Computer Science</p>
                            <p>Supervisor: Professor <a href="https://www.rio.gsic.titech.ac.jp/en/member/yokota.html">Rio Yokota</a></p>
                            <p><i>Note: Formerly Tokyo Institute of Technology, renamed after merging with Tokyo Medical and Dental University in October 2024</i></p>
                        </div>
                    </div>

                    <h3>Apr 2021 - Sep 2024 &nbsp; Tokyo Institute of Technology</h3>
                    <div class="logo-text">
                        <img src="./img/tokyotech.webp" alt="Tokyo Tech logo">
                        <div class="text-container">
                            <p>Master of Science in Computer Science (Apr 2024 - Sep 2024, continued at Institute of Science Tokyo)</p>
                            <p>Bachelor of Science in Computer Science (Apr 2021 - Mar 2024)</p>
                            <p>Supervisor: Professor <a href="https://www.rio.gsic.titech.ac.jp/en/member/yokota.html">Rio Yokota</a></p>
                            <p>Graduated one year early from the bachelor's program due to outstanding academic performance</p>
                        </div>
                    </div>

                <h2 id="research">Research experience</h2>

                <h3>Jun 2024 - Present &nbsp; RIKEN</h3>
                <div class="logo-text">
                    <img src="./img/riken.jpeg" alt="RIKEN logo">
                    <div class="text-container">
                        <p>
                            Research Assistant
                        </p>
                    </div>
                </div>

                <h3>May 2024 - Present &nbsp; NII</h3>
                <div class="logo-text">
                    <img src="./img/nii.jpeg" alt="NII logo">
                    <div class="text-container">
                        <p>
                            Research Assistant at the National Institute of Informatics, Large-Scale Language Model Research and Development Center
                        </p>
                        <p>
                            Mentors: Professor <a href="https://www.fai.cds.tohoku.ac.jp/members/js/">Jun Suzuki</a> and Professor <a href="https://www.rio.gsic.titech.ac.jp/en/member/yokota.html">Rio Yokota</a>
                        </p>
                    </div>
                </div>

                <h3>Feb 2024 - Present &nbsp; Sakana AI</h3>
                <div class="logo-text">
                    <img src="./img/sakana-ai.jpeg" alt="Sakana AI logo">
                    <div class="text-container">
                        <p>
                            Research Internship
                        </p>
                        <p>
                            Mentor: Dr. <a href="https://takiba.net/">Takuya Akiba</a> (Research Scientist)
                        </p>
                    </div>
                </div>

                <h3>Oct 2023 - Apr 2024 &nbsp; LLM-JP</h3>
                <div class="logo-text">
                    <img src="./img/llm-jp.png" alt="llm-jp logo">
                    <div class="text-container">
                        <p>
                            Research Internship
                        </p>
                        <p>
                            Member of the Model Building Working Group (WG)
                        </p>
                        <p>
                            Mentor: Professor <a href="https://www.fai.cds.tohoku.ac.jp/members/js/">Jun Suzuki</a>
                        </p>
                    </div>
                </div>

                <h3>Apr 2022 - Aug 2023 &nbsp; A*Quantum</h3>
                <div class="logo-text">
                    <img src="./img/aq.jpeg" alt="A*Quantum logo">
                    <div class="text-container">
                        <p>
                            Research Internship
                        </p>
                    </div>
                </div>

                <h2 id="industrial">Industrial experience</h2>

                <h3>Jun 2023 - Feb 2024 &nbsp; MITOU TARGET</h3>
                <div class="logo-text">
                    <img src="./img/mitou-tg.png" alt="MITOU TARGET logo">
                    <div class="text-container">
                        <p>Selected for the <a href="https://www.ipa.go.jp/en/it-talents/mitou/target-quantum-computing-2023.html">MITOU TARGET program for Quantum Computing</a>.</p>
                        <p>Developed an educational platform for quantum computing, now available at <a href="https://qualsimu.com/textbook/">Qualsimu Textbook</a>.</p>
                        <p>
                            Mentor: Dr. <a href="https://www.rd.ntt/e/organization/researcher/special/s_023.html">Yuuki Tokunaga</a>
                        </p>
                    </div>
                </div>

                <h3>Feb 2022 - Dec 2022 &nbsp; Crystal Method</h3>
                <div class="logo-text">
                    <img src="./img/crystal.jpeg" alt="crystal method logo">
                    <div class="text-container">
                        <p>
                            Engineering Internship
                        </p>
                    </div>
                </div>

                <h2>Open Source Projects </h2>

                <h3>Oct 2023 - Present &nbsp; Swallow LLM</h3>
                    <div class="logo-text">
                        <img src="./img/tokyotech-llm.png" alt="TokyoTech-LLM logo">
                        <div class="text-container">
                            <p>
                                This project contributes to Japan's AI sovereignty efforts and involves collaboration with esteemed researchers including 
                                Professor <a href="https://www.chokkan.org/">Naoaki Okazaki</a>, 
                                Professor <a href="https://www.rio.gsic.titech.ac.jp/en/member/yokota.html">Rio Yokota</a>,
                                and Dr. <a href="https://sites.google.com/view/hjtakamura">Hiroya Takamura</a>.
                            </p>
                            <p>
                                We invite you to explore our progress:
                                <br>
                                • <a href="https://huggingface.co/tokyotech-llm">View our published models on Hugging Face</a>
                                <br>
                                • <a href="https://swallow-llm.github.io/index.en.html">Learn more about the project on our website</a>
                            </p>
                        </div>     
                    </div>

                <h3>Jun 2023 - Present &nbsp; Ontocord.AI</h3>
                <div class="logo-text">
                    <img src="./img/ontocord.png" alt="Ontocord.AI logo">
                    <div class="text-container">
                        <p>
                            We've developed Aurora-M, a multilingual model designed to address the challenges of low-resource languages. Our latest release, aurora-m-v0.1-biden-harris-redteamed, is the first open-source multilingual model red-teamed against the Biden-Harris Executive Order on AI.
                        </p>
                        <p>
                            Through these projects, we're actively contributing to the open science community, promoting research that aims to reduce illegal and biased AI outputs while enhancing the utility of AI applications across multiple domains and languages.
                        </p>
                        <p>
                            Explore our work: <a href="https://huggingface.co/aurora-m/aurora-m-v0.1">Aurora-M v0.1 on Hugging Face</a>
                        </p>
                    </div>
                </div>

                <h2 id="publications">Publications</h2>

                <h3>COLING 2025 (Industry Track) &nbsp; Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order</h3>
                <div class="paper-info">
                        <p>
                            <b>Taishi Nakamura*</b>, Mayank Mishra*, Simone Tedeschi*, ..., Matthew Blumberg, Victor May, Huu Nguyen, Sampo Pyysalo (49 authors)
                            <a href="https://arxiv.org/abs/2404.00399" target="_blank"><i class="fa fa-file-pdf-o" aria-hidden="true"></i></a>
                        </p>
                </div>

                <h3>COLM 2024 &nbsp; Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities</h3>
                <div class="paper-info">
                        <p>
                            Kazuki Fujii*, <b>Taishi Nakamura*</b>, Mengsay Loem, Hiroki Iida, Masanari Ohi, Kakeru Hattori, Hirai Shota, Sakae Mizuki, Rio Yokota, Naoaki Okazaki
                            <a href="https://arxiv.org/abs/2404.17790" target="_blank"><i class="fa fa-file-pdf-o" aria-hidden="true"></i></a>
                        </p>
                        <p>
                            Acceptance Rate: 28.8%
                        </p>
                </div>

                <h3>COLM 2024 &nbsp; Building a Large Japanese Web Corpus for Large Language Models</h3>
                <div class="paper-info">
                        <p>
                            Naoaki Okazaki, Kakeru Hattori, Hirai Shota, Hiroki Iida, Masanari Ohi, Kazuki Fujii, <b>Taishi Nakamura</b>, Mengsay Loem, Rio Yokota, Sakae Mizuki
                            <a href="https://arxiv.org/abs/2404.17733" target="_blank"><i class="fa fa-file-pdf-o" aria-hidden="true"></i></a>
                        </p>
                        <p>
                            Acceptance Rate: 28.8%
                        </p>
                </div>

                <h2>Preprints</h2>

                <h3>2024 &nbsp; Agent Skill Acquisition for Large Language Models via CycleQD</h3>
                <div class="paper-info">
                        <p>
                            So Kuroki, <b>Taishi Nakamura</b>, Takuya Akiba, Yujin Tang
                        </p>
                        <p>
                            <a href="https://arxiv.org/pdf/2410.14735" target="_blank"><i class="fa fa-file-pdf-o" aria-hidden="true"></i></a>
                        </p>
                </div>

                <h3>2024 &nbsp; LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs</h3>
                <div class="paper-info">
                        <p>
                            LLM-jp: Akiko Aizawa, Eiji Aramaki, ..., <b>Taishi Nakamura</b>, ..., Koichiro Yoshino (79 authors)
                        </p>
                        <p>
                            <a href="https://arxiv.org/abs/2407.03963" target="_blank"><i class="fa fa-file-pdf-o" aria-hidden="true"></i></a>
                        </p>
                        <p>
                            Note: Authors are listed in alphabetical order.
                        </p>
                </div>

                <h2>Domestic Conferences in Japan</h2>

                <h3>IPSJ 2024 &nbsp; LLMに日本語テキストを学習させる意義 (The Significance of Training LLMs on Japanese Text)</h3>
                <div class="paper-info">
                        <p>
                            齋藤 幸史郎, 水木 栄, 大井 聖也, <b>中村 泰士</b>, 塩谷 泰平, 前田 航希, Ma Youmi, 服部 翔, 藤井 一喜, 岡本 拓己, 石田 茂樹, 高村 大也, 横田 理央, 岡崎 直観
                            <a href="https://sites.google.com/sig-nl.ipsj.or.jp/sig-nl/%E7%A0%94%E7%A9%B6%E7%99%BA%E8%A1%A8%E4%BC%9A/NL261" target="_blank"><i class="fa fa-external-link" aria-hidden="true"></i></a>
                        </p>
                        <span>🎉 Best Paper Award (6.7%).</span>
                </div>

                <h3>IPSJ 2024 &nbsp; 継続学習を用いた効率の良いマルチリンガル・マルチエキスパートモデルの開発 (Developing Efficient Multilingual, Multi-Expert Models Using Continual Learning)</h3>
                <div class="paper-info">
                        <p>
                            <b>中村泰士</b>, 横田理央
                            <a href="https://www.ipsj.or.jp/event/taikai/86/WEB/data/pdf/5J-01.html" target="_blank"><i class="fa fa-external-link" aria-hidden="true"></i></a>
                        </p>
                </div>

                <h3>NLP 2024 &nbsp; 継続事前学習による日本語に強い大規模言語モデルの構築 (Building a Large Language Model Strong in Japanese through Continual Pre-Training)</h3>
                <div class="paper-info">
                        <p>
                            藤井一喜*, <b>中村泰士*</b>, Mengsay Loem, 飯田大貴, 大井聖也, 服部翔, 平井翔太, 水木栄, 横田理央, 岡崎直観
                            <a href="https://www.anlp.jp/proceedings/annual_meeting/2024/pdf_dir/A8-5.pdf" target="_blank"><i class="fa fa-file-pdf-o" aria-hidden="true"></i></a>
                        </p>
                        <p>
                            <a href="https://www.anlp.jp/nlp2024/award.html#outstanding" target="_blank">
                                <span>🎉 Outstanding Papers (2.0%).</span>
                            </a>
                        </p>
                </div>

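                <h3>NLP 2024 &nbsp; 大規模言語モデルの日本語能力の効率的な強化: 継続事前学習における語彙拡張と対訳コーパスの活用 (Efficiently Enhancing the Japanese Capabilities of Large Language Models: Vocabulary Expansion and Parallel Corpora in Continual Pre-Training)</h3>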
                <div class="paper-info">
                        <p>
                            水木栄*, 飯田大貴*, 藤井一喜, <b>中村泰士</b>, Mengsay Loem, 大井聖也, 服部翔, 平井翔太, 横田理央, 岡崎直観
                            <a href="https://www.anlp.jp/proceedings/annual_meeting/2024/pdf_dir/A6-4.pdf" target="_blank"><i class="fa fa-file-pdf-o" aria-hidden="true"></i></a>
                        </p>
                </div>

                <h3>NLP 2024 &nbsp; Swallowコーパス: 日本語大規模ウェブコーパス (Swallow Corpus: A Large Japanese Web Corpus)</h3>
                <div class="paper-info">
                        <p>
                            岡崎直観, 服部翔, 平井翔太, 飯田大貴, 大井聖也, 藤井一喜, <b>中村泰士</b>, Mengsay Loem, 横田理央, 水木栄
                            <a href="https://www.anlp.jp/proceedings/annual_meeting/2024/pdf_dir/A6-1.pdf" target="_blank"><i class="fa fa-file-pdf-o" aria-hidden="true"></i></a>
                        </p>
                        <a href="https://www.anlp.jp/nlp2024/award.html#outstanding" target="_blank">
                            <span>🎉 Outstanding Papers (2.0%).</span>
                        </a>
                </div>

                <h2 id="contact"> Contact </h2>
                <a href="https://twitter.com/Setuna7777_2" target="_blank" rel="noreferrer" class="sns"><img src="https://raw.githubusercontent.com/danielcranney/readme-generator/main/public/icons/socials/twitter.svg" width="32" height="32" alt="Twitter" /></a>
                <a href="https://www.github.com/Taishi-N324" target="_blank" rel="noreferrer"><img src="https://raw.githubusercontent.com/danielcranney/readme-generator/main/public/icons/socials/github.svg" width="32" height="32" alt="GitHub" /></a>
                <a href="https://www.linkedin.com/in/taishi-nakamura" target="_blank" rel="noreferrer"><img src="https://raw.githubusercontent.com/danielcranney/readme-generator/main/public/icons/socials/linkedin.svg" width="32" height="32" alt="LinkedIn" /></a>
                <a href="https://scholar.google.com/citations?hl=en&user=nbPQwgUAAAAJ" target="_blank" rel="noreferrer">
                    <img src="./img/google-scholar.svg" width="32" height="32" alt="Google Scholar" />
                </a>
                <p>Last updated: October 2024</p>
            </div>
        </div>
    </body>
</html>