Untold Stories of Intellectual Property: 인공지능 기반 특허 분석 모델 - 최신 특허 출원 소개 (AI-Based Patent Analysis Model - Introduction to the Latest Patent Application)

Wednesday, January 1, 2025

Hello! In this post, I am excited to introduce the technology behind my latest patent application, which I developed during my master's program in artificial intelligence.

This invention presents an innovative methodology focused on automating and improving the accuracy of patent analysis.

The invention was first filed as a provisional application on July 18, 2023 (Application No. 10-2023-0093439), followed by a regular application on June 10, 2024 (Application No. 10-2024-0075102) that claims priority to it. Here, I share the key details of this invention.

Patent Overview

Filing Date: June 10, 2024
Application No.: 10-2024-0075102
Title: Method for Generating Patent Analysis Models Using Artificial Neural Networks and Text Pair Embeddings, Patent Analysis Methods, and Computing Devices
Priority Claim: 10-2023-0093439 (July 18, 2023)

Motivation for the Invention

Traditional patent analysis methods are labor-intensive and struggle to reflect the complexity of legal interpretations.

In particular, accurately identifying relationships between patent claims and descriptions posed challenges, leading to unreliable assessments of infringement and similarity. This invention was developed to address these issues.

Limitations of Background Technology

Lack of Linguistic Adaptation:
- Existing Natural Language Processing (NLP) models fail to capture the complex structure and legal semantics of patent documents.
Legal Interpretation Challenges:
- The inability to reflect legal nuances in claim interpretation results in errors in similarity judgments.
Efficiency Issues:
- Manual reliance on experts for data preparation leads to excessive costs and time.
Performance Constraints:
- Existing models struggle to capture context accurately during embedding, which makes similarity hard to assess.

Proposed Solutions

1. Training Methods

Text Pair Generation and Labeling
- Extract text sequences from claims and descriptions.
- Label pairs as similar (1) if extracted from the same document and dissimilar (0) if from different documents.
- Automatically generate labeled training data without manual classification of infringement or similarity.
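
The labeling rule above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from the application: the `documents` structure (dicts with `claims` and `description` lists), the function name `build_text_pairs`, and the `negatives_per_positive` sampling ratio are all assumptions made for the example.

```python
import random
from itertools import product


def build_text_pairs(documents, negatives_per_positive=1, seed=0):
    """Build (claim, description_passage, label) triples.

    label 1: claim and passage come from the same document.
    label 0: the passage was sampled from a different document.
    """
    rng = random.Random(seed)
    pairs = []
    for i, doc in enumerate(documents):
        other_indices = [j for j in range(len(documents)) if j != i]
        for claim, passage in product(doc["claims"], doc["description"]):
            pairs.append((claim, passage, 1))
            if not other_indices:
                continue  # a single document cannot supply negatives
            for _ in range(negatives_per_positive):
                other = documents[rng.choice(other_indices)]
                pairs.append((claim, rng.choice(other["description"]), 0))
    return pairs


# Hypothetical toy corpus, invented for illustration only.
documents = [
    {
        "claims": [
            "A device comprising a temperature sensor.",
            "The device of claim 1, further comprising a display.",
        ],
        "description": ["The sensor measures the ambient temperature."],
    },
    {
        "claims": ["A method of encrypting data with a rotating key."],
        "description": [
            "The key is rotated every hour.",
            "The data is split into fixed-size blocks.",
        ],
    },
]

pairs = build_text_pairs(documents)
for claim, passage, label in pairs:
    print(label, "|", claim, "|", passage)
```

No one has to judge infringement or similarity by hand here: the label follows entirely from whether the two texts share a source document.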

Pre-training and Transfer Learning
- Utilize transformer-based language models such as BERT.
- Leverage large-scale patent datasets to enhance contextual understanding.

Text Embedding Optimization
- Tokenize text sequences and convert them into vector embeddings.
- Optimize performance using cross-entropy loss functions.
- In training, the loss fell below 19% within 15 epochs and accuracy reached 81%.
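
To make the loss and accuracy figures concrete, here is a small sketch of how the two metrics can be computed for 0/1 pair labels. It assumes the model outputs a probability that each pair is similar and uses the two-class (binary) form of cross-entropy. The probabilities and labels are made-up values, not results from the invention.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def accuracy(probs, labels, threshold=0.5):
    """Fraction of pairs whose thresholded prediction matches the label."""
    correct = sum(int(p >= threshold) == y for p, y in zip(probs, labels))
    return correct / len(labels)


# Made-up predictions for four text pairs (illustration only).
probs = [0.9, 0.2, 0.7, 0.4]
labels = [1, 0, 1, 1]

loss = binary_cross_entropy(probs, labels)
acc = accuracy(probs, labels)
print(f"loss={loss:.4f} accuracy={acc:.2f}")
```

Minimizing this loss pushes the predicted probability toward 1 for same-document pairs and toward 0 for cross-document pairs.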

Batch Learning and Ensemble Learning
- Prevent dataset bias by alternating training between the similar and dissimilar groups.
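
One way to keep either group from dominating training is to draw every batch evenly from both. The sketch below shows that idea only. It is one possible reading of the step above, not necessarily the scheme used in the application, and the helper name `balanced_batches` and the toy pairs are hypothetical.

```python
import random


def balanced_batches(pairs, batch_size=4, seed=0):
    """Return batches with equal numbers of similar (1) and dissimilar (0) pairs.

    Surplus pairs from the larger group are dropped in this simple version.
    """
    if batch_size < 2 or batch_size % 2:
        raise ValueError("batch_size must be an even number >= 2")
    rng = random.Random(seed)
    similar = [p for p in pairs if p[2] == 1]
    dissimilar = [p for p in pairs if p[2] == 0]
    rng.shuffle(similar)
    rng.shuffle(dissimilar)
    half = batch_size // 2
    n_batches = min(len(similar), len(dissimilar)) // half
    batches = []
    for b in range(n_batches):
        batch = similar[b * half:(b + 1) * half] + dissimilar[b * half:(b + 1) * half]
        rng.shuffle(batch)  # mix the two groups inside the batch
        batches.append(batch)
    return batches


# Hypothetical imbalanced data: 6 similar pairs and 10 dissimilar pairs.
toy_pairs = [(f"claim {i}", f"passage {i}", 1) for i in range(6)]
toy_pairs += [(f"claim {i}", f"other passage {i}", 0) for i in range(10)]

batches = balanced_batches(toy_pairs, batch_size=4)
for batch in batches:
    print([label for _, _, label in batch])
```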

2. Inference Methods

Similarity Analysis:
- Embed a patent claim and the target invention as a text pair and analyze their similarity.
Infringement Analysis:
- Evaluate the similarity between product descriptions and claims to assess infringement.
Patentability Analysis:
- Assess the similarity between claims and prior art descriptions.
Classification Optimization:
- Use binary classifiers to automate patentability verification and infringement analysis.
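
The inference flow (embed two texts, score their similarity, then make a binary decision) can be illustrated with a deliberately simple stand-in. The sketch below swaps the trained BERT-based pair model for word-count vectors and cosine similarity, so it shows only the shape of the pipeline, not the invention's model. The function names, the 0.5 threshold, and the example texts are all assumptions.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a learned embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_similar(claim, target, threshold=0.5):
    """Binary decision plus the underlying score for a claim/target pair."""
    score = cosine_similarity(embed(claim), embed(target))
    return score >= threshold, score


# Hypothetical texts, invented for illustration only.
claim = "A bicycle lock comprising a steel cable and a combination dial."
product_description = "This bicycle lock uses a steel cable with a combination dial."
unrelated_text = "A method for brewing coffee using cold water."

print(is_similar(claim, product_description))
print(is_similar(claim, unrelated_text))
```

The same call pattern covers the three use cases above: pass a product description for infringement analysis, a prior-art passage for patentability analysis, or any target invention for general similarity analysis.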

Key Benefits

Automation and Efficiency Improvement:
- Rapidly process large-scale training datasets without manual preparation.
Enhanced Accuracy:
- Improve reliability by reflecting contextual meaning and legal nuances in claims and descriptions.
- Accuracy was verified in tests on 100 inventions.
Scalability:
- Scale to large analysis workloads through hardware (neuromorphic computing) and software (BERT-based model) optimization.
Cost Reduction:
- Reduce reliance on experts, lowering costs and accelerating analysis speed.
Legal Interpretation Support:
- Assist in patent registration, infringement lawsuits, and novelty verification.

Conclusion

This invention addresses limitations in traditional methods by introducing an AI-powered patent analysis model that automates similarity and infringement analysis. By optimizing text pair embeddings and neural network training, this solution effectively interprets complex patent documents and can be broadly applied in legal and R&D fields.

Future plans include validating the performance of this technology across various fields and continuously enhancing its performance and application scope. I look forward to receiving feedback and engaging in discussions with interested readers!
