Naver Corporation

10/18/2024 | Press release | Archived content

NAVER Demonstrates World-Class Search Technology with Papers Accepted at Global NLP Conference



- Search technology papers to be presented at "EMNLP 2024," held on November 12-16 in Florida, USA

- Research findings improve NAVER's search performance: Enhancing result relevance in "CUE:," advancing automatic information extraction in "Knowledge Snippet," and improving long-tail query accuracy

- NAVER Search accelerates quality improvements to address users' diversifying search needs, expanding its indexing scale by 50% and enhancing techniques that emphasize highly reliable content


NAVER Corporation (CEO Choi Soo-yeon) has proven its world-class search technology capabilities by having papers on search technology accepted at "Empirical Methods in Natural Language Processing (EMNLP) 2024," one of the most prestigious global natural language processing (NLP) conferences.

Now in its 28th year, EMNLP is considered one of the top AI conferences in the field of NLP, along with the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) and the Association for Computational Linguistics (ACL). The conference covers a wide range of research on language data-based NLP approaches, including AI translation, chatbots, and machine reading comprehension. EMNLP 2024 will be held from November 12 to 16 in Florida, USA, and NAVER will present four accepted papers, including three on its search technology research.

Notably, NAVER has applied this research, directly and indirectly, to its search services, improving both the quality and usability of search and demonstrating the practical value of the research.

One of the accepted papers discusses the algorithm applied to NAVER's generative AI search service, "CUE:." The research covers a modular approach that uses small language models (SLMs) to detect harmful queries and provide appropriate responses. NAVER applied these findings to "CUE:" in November of last year to enhance AI safety. For example, the system identifies queries related to illegal content such as crime or harmful information, copyright infringement, privacy violations, personal data leaks, and profanity, and withholds indiscriminate answers, creating a safer generative AI search environment. Going forward, NAVER plans to use this technology to improve the relevance of search results, expand exposure to high-quality content, and prioritize answers from reliable sources to enhance overall search service quality.

NAVER has also proposed a technology for its "Knowledge Snippet" service, which provides summaries of key information related to search terms at the top of integrated search results. This technology allows AI to effectively process not only text but also complex snippet formats such as lists and tables. The technology is expected to be applied to "Knowledge Snippet" in the first half of next year, increasing the accuracy of answers to long-tail queries (long and complex search terms) and helping users quickly find the information they need.
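
A toy sketch of why table- and list-shaped snippets need their own handling (hypothetical; the linearization scheme and lexical scorer below are invented stand-ins for the trained models described in the paper):

```python
# Hypothetical sketch: answering a long-tail query from a table-style
# snippet by linearizing the rows into scorable text candidates.

def flatten_table(caption: str, rows: list[tuple[str, str]]) -> list[str]:
    """Linearize table rows into "caption attribute value" strings a ranker can score."""
    return [f"{caption} {attr} {val}" for attr, val in rows]

def best_snippet(query: str, candidates: list[str]) -> str:
    """Toy word-overlap scorer standing in for a trained snippet-ranking model."""
    q_terms = set(query.lower().split())
    return max(candidates, key=lambda c: len(q_terms & set(c.lower().split())))

rows = [("boiling point", "100 °C"), ("freezing point", "0 °C")]
candidates = flatten_table("water", rows)
print(best_snippet("boiling point of water", candidates))
```

The point of the sketch: once structured content is linearized, the same ranking machinery used for plain text can select the row that actually answers the query.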

In addition, a paper was accepted on a method for distilling the document-ranking capabilities of a large language model (LLM) into a smaller language model (sLLM) for use in search services. The technique was devised to offer LLM-level quality without slowing down a search service that must return results in real time. In June, NAVER applied the model introduced in the paper to its integrated search service, enabling it to surface more contextually relevant documents for long-tail queries. Following the rollout, the document click-through rate (CTR) increased by 4.3%, and time spent on the page increased by 3%.
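
The core idea of ranking distillation can be sketched in a few lines (hypothetical and simplified; the scores below are made up, and a real system would train the student with gradient descent rather than just measure the loss):

```python
import math

# Hypothetical sketch of ranking distillation: a small student model is
# trained so that its passage-score distribution matches a large teacher's.

def softmax(scores):
    """Convert raw relevance scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q): a listwise distillation loss between teacher and student."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

teacher_scores = [2.0, 0.5, -1.0]   # large LLM's relevance scores for 3 passages
student_scores = [1.8, 0.6, -0.9]   # small model's scores after some training

loss = kl_divergence(softmax(teacher_scores), softmax(student_scores))
```

A low loss means the student ranks passages almost exactly like the teacher, while being small enough to run within a real-time search budget.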

Moreover, NAVER Search has published search technology papers in prestigious AI venues this year: in addition to EMNLP, one paper at NAACL, two at CVPR, and one each at Information Sciences, LREC-COLING, and SIGIR/LLM4Eval. Seven papers were also accepted at the 36th Annual Conference on Human and Cognitive Language Technology (HCLT), Korea's most prestigious academic conference in the field, with two selected as outstanding papers, showcasing NAVER's high level of search technology.

Kim Kwang-hyun, Head of NAVER's Search/Data Platform Division, said, "This research not only solidifies NAVER's leadership in the domestic search market but also demonstrates our capabilities on the global stage." He added, "We will continue to offer competitive search services optimized for users by further enhancing search accuracy and experimenting with generative AI."

Meanwhile, NAVER has been continuously enhancing its technology and infrastructure to meet users' evolving search needs, expanding its web search index by 50% on the strength of enhanced computing power and using AI to emphasize trustworthy content. In August, NAVER developed a machine learning method for recognizing reliable documents and improved its ranking system to better evaluate the trustworthiness and expertise of document sources.

[Reference] List of Accepted Papers at EMNLP 2024

(1-3: Search technology-related, 4: Generative AI research-related)

1. SLM as Guardian: Pioneering AI Safety with SLMs

- Ohjoon Kwon, Donghyeon Jeon, Nayoung Choi, Gyu-Hwung Cho, Changbong Kim, Hyunwoo Lee, Inho Kang, Sun Kim, Taiwoo Park

2. Hyper-QKSG: Framework for Automating Query Generation and Knowledge-Snippet Extraction from Tables and Lists

- Dooyoung Kim, Yoonjin Jang, Dongwook Shin, Chanhoon Park, and Youngjoong Ko

3. RRA Distill: Distilling LLM's Passage Ranking Ability for Long-Tail Queries Document Re-Ranking on a Search Engine

- Nayoung Choi, Youngjune Lee, Gyu-Hwung Cho, Haeyu Jeong, Jungmin Kong, Saehun Kim, Keunchan Park, Jaeho Choi, Sarah Cho, Inchang Jeong, Gyohee Nam, Sunghoon Han, Wonil Yang

4. Text2Chart31: Instruction Tuning for Chart Generation with Automatic Feedback

- Fatemeh Pesaran Zadeh, Juyeon Kim, Jin-Hwa Kim, and Gunhee Kim
