[AI] SISC2-37 [FEAT] Implement saving of transformer training files #71

twq110 · 2025-11-01T11:46:56Z

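Changed some absolute paths to relative paths
Changed the project_root order in the finder module
Added an Ollama connection helper function and server health-check code

Moved the technical-indicator utilities out into a separate module
Split transformer into training and inference, and changed it to save the trained data to weights
Wrote test code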

Summary by CodeRabbit

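Release notes

New Features
- Added LLM client connection and model validation
- Introduced training and inference pipelines for a Transformer-based classification model, plus feature-generation and inference modules
Bug Fixes
- Strengthened error handling and resilience across the pipeline (file, network, and initialization failures)
Tests
- Added a database connection check script and transform/inference tests based on real data
Chores
- Cleaned up repository ignore rules and the encoding/dependency notation of some configuration files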

coderabbitai · 2025-11-01T11:47:05Z

📥 Commits

Reviewing files that changed from the base of the PR and between 014a9bd and e355350.

📒 Files selected for processing (2)

AI/tests/quick_db_check.py (1 hunks)
AI/transformer/__init__.py (1 hunks)

Walkthrough

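A refactoring that modularizes the transformer module to separate the inference and training pipelines, centralizes the Ollama LLM client (get_ollama_client), strengthens exception handling in Finder, cleans up import paths, and adds new utilities, test scripts, and training infrastructure.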

Changes

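Cohort / File(s)	Change summary
Repository settings/encoding `.gitignore`, `AI/configs/config.json`, `AI/requirements.txt`	Added `env` and `/.vs` patterns to `.gitignore`; a BOM (encoding residue) was inserted into `AI/configs/config.json` and `AI/requirements.txt` (content otherwise unchanged); added a `pathlib` entry to `AI/requirements.txt`
LLM client infrastructure `AI/libs/llm_clients/ollama_client.py`, `AI/libs/llm_clients/__init__.py`	Introduced `get_ollama_client()`, which performs an Ollama server health check and verifies model availability, and added a package-level re-export
Finder hardening `AI/finder/main.py`	Replaced direct Ollama instantiation with `get_ollama_client()`; added exception handling and detailed logging around LLM initialization, CSV loading, the news-summary call, and the Top 3 calculation; added a path that returns an empty result on failure
Pipeline and utility changes `AI/libs/core/pipeline.py`, `AI/libs/utils/__init__.py`, `AI/libs/utils/fetch_ohlcv.py`	Adjusted import paths (shortened the AI.* prefix / made them relative); `run_weekly_finder` uses a hardcoded ticker list (for testing); the pipeline now loads and validates the config file and guards each stage; the `fetch_ohlcv` query now includes `ticker` and `adjusted_close`, extending the returned schema
Transformer entry-point refactoring `AI/transformer/main.py`, `AI/transformer/__init__.py`	Removed the in-file inference logic and delegated to `modules.inference.run_inference`; changed the order of `seq_len` and `pred_h` in the `run_transformer` signature and added weights-path computation
Transformer modules added `AI/transformer/modules/features.py`, `AI/transformer/modules/inference.py`	`features.py`: computes various technical indicators and adds `build_features()`; `inference.py`: adds an end-to-end inference pipeline (`CLASS_NAMES`, `run_inference`) covering sequence generation, scaling, model loading, prediction, and fallback logic
Transformer training infrastructure `AI/transformer/training/__init__.py`, `AI/transformer/training/train_transformer.py`	Added a Transformer classification training pipeline (labeling, sequence building, scaler, training loop, Yahoo Finance data collection, CLI entry point, and so on)
Tests and utility scripts `AI/tests/quick_db_check.py`, `AI/tests/test_transfomer.py`	Added a PostgreSQL connection check script; added a live test (with a retry wrapper) that validates the transformer flow against real-time OHLCV data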

Sequence Diagram(s)

sequenceDiagram
    participant Finder as Finder (AI/finder/main.py)
    participant LLM as llm_clients.get_ollama_client()
    participant Ollama as Ollama Server

    Finder->>LLM: get_ollama_client()
    LLM->>Ollama: GET /api/tags (health)
    alt Server responds
        Ollama-->>LLM: models list
        LLM->>LLM: Check whether the model exists
        alt Model exists
            LLM-->>Finder: Return Ollama instance
            Finder->>Finder: Initialize LLM, then proceed
        else Model missing
            LLM-->>Finder: RuntimeError (model not available)
            Finder->>Finder: Log error and return empty result
        end
    else Connection failed
        Ollama-->>LLM: Connection Error
        LLM-->>Finder: RuntimeError (connection failed)
        Finder->>Finder: Log error and return empty result
    end

sequenceDiagram
    participant Pipeline as Pipeline (AI/libs/core/pipeline.py)
    participant Config as Config File
    participant Finder as Finder
    participant Transformer as Transformer (modules/inference)
    participant DB as Database

    Pipeline->>Config: load configs/config.json
    Config-->>Pipeline: config dict
    Pipeline->>Finder: run_weekly_finder() (hardcoded tickers)
    Finder-->>Pipeline: top_tickers
    alt Finder succeeds
        Pipeline->>Transformer: run_transformer(...) -> run_inference
        Transformer->>Transformer: build_features() -> sequences -> scale -> predict
        Transformer-->>Pipeline: logs DataFrame
        Pipeline->>DB: save results
        DB-->>Pipeline: OK
    else Finder fails
        Pipeline->>Pipeline: Log warning and stop
    end

sequenceDiagram
    participant Train as Training (train_transformer.py)
    participant Yahoo as Yahoo Finance
    participant Features as Features (features.py)
    participant Model as Transformer Model

    Train->>Yahoo: _fetch_yahoo_ohlcv(ticker, range)
    Yahoo-->>Train: OHLCV data
    Train->>Features: build_features(df)
    Features-->>Train: feature DataFrame
    Train->>Train: _label_by_future_return(), _build_sequences(), _fit_scaler_on_train()
    Train->>Model: build_transformer_classifier()
    Model-->>Train: model instance
    Train->>Model: model.fit(X_train, y_train)
    Model-->>Train: training history
    Train->>Train: save weights & scaler
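
The labeling and windowing steps in the training diagram might look roughly like the following. This is a minimal sketch, not the code in train_transformer.py: the function names, the three-class encoding (0 = down, 1 = flat, 2 = up), and the `threshold` parameter are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple


def label_by_future_return(
    closes: Sequence[float], horizon: int, threshold: float
) -> List[int]:
    """Label each bar from the return `horizon` bars ahead.

    Hypothetical encoding: 2 = up, 0 = down, 1 = flat.
    The last `horizon` bars have no future price, so they get no label.
    """
    labels: List[int] = []
    for i in range(len(closes) - horizon):
        future_return = closes[i + horizon] / closes[i] - 1.0
        if future_return > threshold:
            labels.append(2)
        elif future_return < -threshold:
            labels.append(0)
        else:
            labels.append(1)
    return labels


def build_sequences(
    features: Sequence[Sequence[float]], labels: Sequence[int], seq_len: int
) -> Tuple[List[List[Sequence[float]]], List[int]]:
    """Pair each window of `seq_len` consecutive rows with the label of its last row."""
    windows: List[List[Sequence[float]]] = []
    targets: List[int] = []
    for end in range(seq_len, len(labels) + 1):
        windows.append(list(features[end - seq_len:end]))
        targets.append(labels[end - 1])
    return windows, targets
```

Each window only contains rows up to the bar being labeled, so the future price enters through the label alone and never through the input features.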

Estimated code review effort

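🎯 4 (Complex) | ⏱️ ~50 minutes

Key review points:
- AI/transformer/modules/inference.py: correctness and edge cases of the sequence generation, scaling, and fallback logic
- AI/transformer/training/train_transformer.py: labeling criteria, data collection (time zone/frequency), scaling, and training reproducibility
- AI/transformer/main.py: impact on callers of the public API (function signature) change
- AI/finder/main.py: exception handling and log usefulness on the LLM initialization failure and external call (news/CSV) paths
- AI/libs/core/pipeline.py: effect of the import path changes and the hardcoded test tickers on pipeline behavior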

Possibly related PRs

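[AI] SISC2-36 [FEAT] Implement final report saving #69: overlaps at the file/function level with the changes to the pipeline, transformer, utilities (fetch_ohlcv/get_db_conn), and config handling.
[AI] SISC2-35 [FIX] AI part integrated pipeline #61: directly related to the pipeline orchestration changes in AI/libs/core/pipeline.py.
[AI] SISC2-20 [FEAT] LLM-based stock recommendation from stock news #48: code paths overlap with the LLM initialization and news/stability integration changes in AI/finder/main.py.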

Suggested reviewers

Kosw6

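Poem

🐰 The modules gather and dance, knocking on the door with a health check,
sequences fall into order and features sparkle,
training and inference each find a new path,
and the pipeline quietly ties it all together. ✨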

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
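Title Check	✅ Passed	The PR title is "[AI] SISC2-37 [FEAT] Implement saving of transformer training files", which relates to part of the changes: Transformer training and model saving. The changes do include substantial refactoring of the Transformer module, including train_transformer.py, inference.py, and features.py. However, several major changes stated in the PR objectives are not reflected in the title: converting absolute paths to relative paths, adding the Ollama connection helper function, extracting the technical-indicator utilities, and refactoring the pipeline. The title covers a real aspect of the changes but does not fully represent their overall scope.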
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

coderabbitai

Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

AI/configs/config.json (1)
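2-18: Move the hardcoded database credentials to environment variables.

The database host, username, and password are exposed directly in the code. This is a serious security risk. Move the credentials to environment variables or a secrets vault.

Example: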
import os
config = {
    "db": {
        "host": os.environ.get("DB_HOST"),
        "user": os.environ.get("DB_USER"),
        "password": os.environ.get("DB_PASSWORD"),
        "dbname": os.environ.get("DB_NAME"),
        "port": int(os.environ.get("DB_PORT", 5432))
    }
}
Alternatively, use a .env file and add it to .gitignore.
AI/libs/core/pipeline.py (2)
35-56: The pipeline does not stop immediately when the DB config is missing, so a KeyError occurs

If configs/config.json is missing or its db section is empty, db_config becomes {}, and the subsequent fetch_ohlcv()→get_db_conn() call raises a KeyError on 'host', killing the pipeline on the spot. When the config is empty, add defensive logic that exits safely instead of proceeding to the Transformer stage.
     try:
         with open(os.path.join(project_root, 'configs', 'config.json'), 'r') as f:
             config = json.load(f)
     except FileNotFoundError:
         print("Config file not found")
-    except json.JSONDecodeError:
-        print("Invalid JSON format in config file")
-    db_config = (config or {}).get("db", {})   # ★ extract only the db section
+        return pd.DataFrame()
+    except json.JSONDecodeError:
+        print("Invalid JSON format in config file")
+        return pd.DataFrame()
+    db_config = (config or {}).get("db", {})   # ★ extract only the db section
+    if not db_config:
+        print("DB config is missing; the Transformer module cannot run.")
+        return pd.DataFrame()
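
The same guard can be factored into a small loader so that every caller gets either a usable section or an empty dict. The following is a minimal sketch under stated assumptions: `load_db_config` is a hypothetical helper (it does not exist in the PR), and returning `{}` on any failure is one possible convention rather than the project's actual behavior.

```python
import json
from typing import Any, Dict


def load_db_config(config_path: str, section: str = "db") -> Dict[str, Any]:
    """Return one section of a JSON config file, or {} if it cannot be used.

    Hypothetical helper: callers are expected to treat {} as "skip this stage".
    """
    try:
        # utf-8-sig decodes plain UTF-8 and also tolerates a leading BOM.
        with open(config_path, "r", encoding="utf-8-sig") as f:
            config = json.load(f)
    except FileNotFoundError:
        print(f"Config file not found: {config_path}")
        return {}
    except json.JSONDecodeError as exc:
        print(f"Invalid JSON format in config file: {exc}")
        return {}

    if not isinstance(config, dict):
        return {}
    value = config.get(section)
    # Only a non-empty mapping counts as a usable section.
    return value if isinstance(value, dict) else {}
```

A caller would then check `if not db_config:` once and return early, instead of letting a missing `'host'` key surface later as a KeyError inside the connection helper.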
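125-141: A connection is opened even when the report_DB config is missing, which raises an exception

If config.json is missing or its report_DB section is empty, get_db_conn() raises a KeyError on 'host' here as well and the pipeline stops. Guard this so the save step is skipped when the DB config is missing.
     db_config = config.get("report_DB", {})
+    if not db_config:
+        print("[WARN] report_DB config is missing; the report will not be saved.")
+        return
     conn = get_db_conn(db_config)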

🧹 Nitpick comments (3)

AI/libs/llm_clients/ollama_client.py (2)
21-33: Narrow the overly broad exception handling.

Line 32 catches every exception, which can hide unexpected errors. It is better to specify concrete exception types.

It can be improved as follows:
-    except Exception:
+    except (requests.exceptions.RequestException, KeyError, ValueError, TypeError):
         return False
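
To illustrate the idea of catching only the failures a model check can reasonably expect, here is a minimal, self-contained sketch. It is not the code in ollama_client.py: it uses the standard library's `urllib` instead of `requests` so that it runs without extra dependencies, the names `fetch_tags` and `is_model_available` are hypothetical, and the `{"models": [{"name": ...}]}` response shape for `/api/tags` is assumed.

```python
import json
import urllib.error
import urllib.request
from typing import Any, Callable


def fetch_tags(base_url: str, timeout: float = 3.0) -> Any:
    """GET {base_url}/api/tags and decode the JSON body."""
    url = f"{base_url.rstrip('/')}/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def is_model_available(
    base_url: str, model: str, fetch: Callable[[str], Any] = fetch_tags
) -> bool:
    """Return True if `model` is listed by the server, False on expected failures."""
    try:
        payload = fetch(base_url)
        names = [entry["name"] for entry in payload["models"]]
    except (urllib.error.URLError, OSError, ValueError, KeyError, TypeError):
        # Network errors, malformed JSON, or an unexpected payload shape.
        return False
    # Accept both "llama3.2" and a tagged name such as "llama3.2:latest".
    return any(name == model or name.split(":")[0] == model for name in names)
```

Anything outside the listed exception types (for example a programming error inside `fetch`) still propagates, which is the point of narrowing the `except` clause. Passing `fetch` as a parameter also lets the check be exercised without a running server.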
35-68: Consider improving how exception messages are handled.

Long exception messages are currently hardcoded inside the function. For maintainability, consider moving the messages into constants or using a custom exception class.

Example:
class OllamaConnectionError(RuntimeError):
    def __init__(self, base_url: str):
        msg = (
            f"[Connection failed] Cannot reach the Ollama server. Please check whether llama3.2 is installed.\n"
            f"- base_url: {base_url}\n"
            f"- Actions: (1) check that 'ollama serve' is running (2) firewall/proxy (NO_PROXY=localhost,127.0.0.1) (3) open port 11434\n"
            f"- Test: curl {base_url}/api/tags"
        )
        super().__init__(msg)
AI/tests/test_transfomer.py (1)
100-100: Possible code simplification (optional).

config is already initialized to an empty dict on line 91, so config or {} is unnecessary.

It can be simplified as follows:
-    db_config = (config or {}).get("db", {})  # passed as-is to fetch_ohlcv
+    db_config = config.get("db", {})  # passed as-is to fetch_ohlcv

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f80909a and 5cbb3c7.

⛔ Files ignored due to path filters (1)

transformer/scaler/scaler.pkl is excluded by !**/*.pkl

📒 Files selected for processing (17)

.gitignore (1 hunks)
AI/configs/config.json (1 hunks)
AI/finder/main.py (1 hunks)
AI/libs/core/pipeline.py (3 hunks)
AI/libs/llm_clients/__init__.py (1 hunks)
AI/libs/llm_clients/ollama_client.py (1 hunks)
AI/libs/utils/__init__.py (1 hunks)
AI/libs/utils/fetch_ohlcv.py (2 hunks)
AI/requirements.txt (2 hunks)
AI/tests/quick_db_check.py (1 hunks)
AI/tests/test_transfomer.py (1 hunks)
AI/transformer/__init__.py (1 hunks)
AI/transformer/main.py (2 hunks)
AI/transformer/modules/features.py (1 hunks)
AI/transformer/modules/inference.py (1 hunks)
AI/transformer/training/__init__.py (1 hunks)
AI/transformer/training/train_transformer.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (10)

AI/libs/llm_clients/__init__.py (1)

AI/libs/llm_clients/ollama_client.py (1)

get_ollama_client (35-68)

AI/transformer/training/__init__.py (1)

AI/transformer/training/train_transformer.py (1)

train_transformer_classifier (111-286)

AI/tests/test_transfomer.py (2)

AI/transformer/main.py (1)

run_transformer (15-82)

AI/libs/utils/fetch_ohlcv.py (1)

fetch_ohlcv (7-41)

AI/transformer/training/train_transformer.py (2)

AI/transformer/modules/features.py (1)

build_features (90-131)

AI/transformer/modules/models.py (1)

build_transformer_classifier (16-44)

AI/finder/main.py (3)

AI/libs/llm_clients/ollama_client.py (1)

get_ollama_client (35-68)

AI/libs/utils/news_processing.py (1)

get_weekly_news_summary (163-205)

AI/finder/ticker_selector.py (1)

select_top_stocks (61-156)

AI/libs/utils/__init__.py (2)

AI/libs/utils/fetch_ohlcv.py (1)

fetch_ohlcv (7-41)

AI/libs/utils/get_db_conn.py (1)

get_db_conn (3-12)

AI/libs/core/pipeline.py (3)

AI/transformer/main.py (1)

run_transformer (15-82)

AI/libs/utils/fetch_ohlcv.py (1)

fetch_ohlcv (7-41)

AI/libs/utils/get_db_conn.py (1)

get_db_conn (3-12)

AI/transformer/modules/inference.py (2)

AI/transformer/modules/models.py (1)

build_transformer_classifier (16-44)

AI/transformer/modules/features.py (1)

build_features (90-131)

AI/transformer/main.py (1)

AI/transformer/modules/inference.py (1)

run_inference (48-198)

AI/libs/utils/fetch_ohlcv.py (1)

AI/libs/utils/get_db_conn.py (1)

get_db_conn (3-12)

🪛 Ruff (0.14.2)

AI/tests/quick_db_check.py

42-42: String contains ambiguous ℹ (INFORMATION SOURCE). Did you mean i (LATIN SMALL LETTER I)?

(RUF001)

44-44: Do not catch blind exception: Exception

(BLE001)

AI/tests/test_transfomer.py

65-65: Consider moving this statement to an else block

(TRY300)

71-71: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)

97-97: Do not catch blind exception: Exception

(BLE001)

141-141: Avoid specifying long messages outside the exception class

(TRY003)

149-149: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)

AI/libs/llm_clients/ollama_client.py

17-17: Consider moving this statement to an else block

(TRY300)

31-31: Consider moving this statement to an else block

(TRY300)

32-32: Do not catch blind exception: Exception

(BLE001)

49-54: Avoid specifying long messages outside the exception class

(TRY003)

57-61: Avoid specifying long messages outside the exception class

(TRY003)

AI/transformer/training/train_transformer.py

82-82: Unused function argument: feats

(ARG001)

155-155: Avoid specifying long messages outside the exception class

(TRY003)

162-162: Avoid specifying long messages outside the exception class

(TRY003)

165-165: Avoid specifying long messages outside the exception class

(TRY003)

227-227: Avoid specifying long messages outside the exception class

(TRY003)

282-282: Value being cast to int is already an integer

Remove unnecessary int call

(RUF046)

283-283: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

299-299: Avoid specifying long messages outside the exception class

(TRY003)

344-344: Abstract raise to an inner function

(TRY301)

344-344: Avoid specifying long messages outside the exception class

(TRY003)

369-369: Do not catch blind exception: Exception

(BLE001)

372-372: Avoid specifying long messages outside the exception class

(TRY003)

AI/finder/main.py

24-24: Do not catch blind exception: Exception

(BLE001)

45-45: Do not catch blind exception: Exception

(BLE001)

66-66: Do not catch blind exception: Exception

(BLE001)

AI/transformer/modules/features.py

107-107: Ambiguous variable name: O

(E741)

107-107: Local variable O is assigned to but never used

Remove assignment to unused variable O

(F841)

AI/transformer/modules/inference.py

41-41: Do not catch blind exception: Exception

(BLE001)

53-53: Unused function argument: pred_h

(ARG001)

56-56: Unused function argument: interval

(ARG001)

92-92: Avoid specifying long messages outside the exception class

(TRY003)

151-151: Do not catch blind exception: Exception

(BLE001)

157-157: Multiple statements on one line (semicolon)

(E702)

159-159: Multiple statements on one line (semicolon)

(E702)

161-161: Multiple statements on one line (semicolon)

(E702)

190-190: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (12)

AI/libs/llm_clients/ollama_client.py (1)

11-19: LGTM!

The server health-check logic is implemented appropriately. The timeout setting and exception handling are correct.

AI/transformer/modules/features.py (3)

1-22: LGTM!

The FEATURES constant is defined clearly, and the technical indicators it includes are appropriate. Placing CLOSE_RAW last so it can be excluded from scaling is also a good design.

25-88: LGTM!

The technical-indicator functions are well implemented. Wilder's RSI/ATR calculation, NaN handling, and the zero-denominator case are all handled appropriately.
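
For readers unfamiliar with Wilder's smoothing, a dependency-free sketch of an RSI calculation with an explicit zero-denominator guard follows. It is an illustration only and not the implementation in features.py: the function names are hypothetical, the output uses `None` where the PR's code would presumably use NaN, and returning 50.0 for a completely flat series is an assumed convention.

```python
from typing import List, Optional, Sequence


def _rsi_from_averages(avg_gain: float, avg_loss: float) -> float:
    """Convert smoothed average gain/loss into an RSI value in [0, 100]."""
    if avg_loss == 0.0:
        # No losses in the window: avoid dividing by zero.
        # Assumed convention: 100 if there were gains, 50 if prices were flat.
        return 100.0 if avg_gain > 0.0 else 50.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def wilder_rsi(closes: Sequence[float], period: int = 14) -> List[Optional[float]]:
    """RSI with Wilder's smoothing; entries before the first full window are None."""
    if period < 1:
        raise ValueError("period must be >= 1")
    rsi: List[Optional[float]] = [None] * len(closes)
    if len(closes) <= period:
        return rsi

    changes = [closes[i] - closes[i - 1] for i in range(1, len(closes))]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]

    # Seed with the simple average of the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    rsi[period] = _rsi_from_averages(avg_gain, avg_loss)

    # Wilder's smoothing: an exponential average with alpha = 1 / period.
    for i in range(period, len(changes)):
        avg_gain += (gains[i] - avg_gain) / period
        avg_loss += (losses[i] - avg_loss) / period
        rsi[i + 1] = _rsi_from_averages(avg_gain, avg_loss)
    return rsi
```

The guard in `_rsi_from_averages` is the part that corresponds to the zero-denominator handling praised in the review; without it, a strictly rising window would raise ZeroDivisionError.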

90-131: LGTM!

The build_features function is well implemented. Column-name normalization, technical-indicator calculation, and NaN handling are carried out systematically.

AI/libs/utils/__init__.py (1)

2-8: LGTM!

The package initialization is set up appropriately. Explicitly defining the public API through __all__ is good.

AI/tests/test_transfomer.py (7)

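1-36: LGTM! The imports and project structure setup are appropriate.

The file structure and imports are clear, and the test's purpose and how to run it are well documented in the docstring.

43-73: The retry logic is well implemented.

The retry mechanism with exponential backoff and jitter is appropriate. The static analysis warnings are false positives in this context:

Using random.uniform is fine because it is for jitter, not cryptography

Catching Exception broadly is appropriate in a retry wrapper

A return inside the try block is natural for this pattern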
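
A retry wrapper with exponential backoff and jitter, of the kind described above, could be sketched as follows. This is not the wrapper in test_transfomer.py: the name `call_with_retry`, its parameters, and the default delays are assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    *,
    attempts: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    jitter: float = 0.25,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on any exception with exponential backoff plus jitter."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # Out of retries: surface the last error to the caller.
            # Delay doubles each attempt (0.5, 1.0, 2.0, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out retries; it has no cryptographic role.
            sleep(delay + random.uniform(0.0, jitter))
    raise AssertionError("unreachable")
```

The three flagged patterns all appear here on purpose: `random.uniform` only spreads out retry timing, the broad `except Exception` is what lets the wrapper retry arbitrary transient failures, and the `return` inside `try` is the success path. Injecting `sleep` keeps tests fast.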

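109-117: LGTM! The Transformer configuration is appropriate.

The features list includes "adjusted_close", which matches the schema returned by fetch_ohlcv.

138-146: LGTM! The schema validation and date conversion are robust.

Required-column validation and date type conversion are implemented appropriately. Missing columns are reported clearly, and date parsing errors are handled safely.

151-169: LGTM! The Transformer call is correct.

The data merging logic and the run_transformer call signature are both implemented correctly. Early exit on empty data is also handled appropriately.

171-202: LGTM! The result handling logic is comprehensive and safe.

Result extraction, conversion, and statistics output are implemented defensively. The checks for empty DataFrames and missing columns are appropriate.

208-209: LGTM! The entry point follows the standard pattern.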

coderabbitai · 2025-11-01T11:55:29Z

.gitignore

+################################################################################
+# This .gitignore file was automatically created by Microsoft(R) Visual Studio.
+################################################################################

-# Environment variables & sensitive information
-.env
-*.secret
-*.key
-*.pem
-
-# Logs
-*.log
-
-# ===== Java (Spring Boot) =====
-/target/
-
-# ===== Node.js (React / Vite) =====
-node_modules/
-dist/
-
-# ===== Python (AI / Data) =====
-__pycache__/
-.ipynb_checkpoints/
-*.pyc
-/venv/
+/env
+/.vs


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

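.gitignore has been oversimplified, creating security and compliance risks

.gitignore was cut down sharply from a broad set of ignore patterns to a minimal set (env, /.vs). This introduces the following serious risks:

Security and compliance issues:

Sensitive information (API keys, credentials, .env files, and so on) can be committed by accident

The PR description says training data is saved to the weights directory, but that folder is not explicitly ignored

Personally identifiable information (PII) or model artifacts may be exposed in the repository

Development environment issues:

Python caches (__pycache__, *.pyc, .egg-info) get tracked and pollute the repository

Per-developer IDE settings (PyCharm, VS Code, *.swp, .vscode, and so on) get committed and cause conflicts

OS-specific files (.DS_Store, Thumbs.db, and so on) get tracked

Recommendations:
Confirm the intent of this change and restore or add at least the following patterns:

Environment/dependencies: venv/, .venv/, *.egg-info/, dist/, build/, *.whl

Python caches: __pycache__/, *.py[cod], *.so

IDE: .vscode/, .idea/, *.swp, *.sublime-*

OS: .DS_Store, Thumbs.db, .directory

Logs/temporary: *.log, /tmp/, *.tmp

Model artifacts: /weights/ (related to saving training data)

System: .env, .env.*, *.pid

Please confirm whether this change is intentional. If it is, discuss the security and compliance impact with the review team.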

🏁 Script executed:

#!/bin/bash
# Show git diff for .gitignore to see before/after
git diff HEAD -- .gitignore | head -100

Length of output: 0

🏁 Script executed:

#!/bin/bash
# List root directories/files to assess what should be ignored
ls -la | head -30

Length of output: 819

🏁 Script executed:

#!/bin/bash
# Search for 'weights' references in PR context or config files
rg -i "weights" --type-add 'config:*.md,*.txt,*.json' -n -C2 2>/dev/null | head -50

Length of output: 3415

🏁 Script executed:

#!/bin/bash
# Check if weights directory exists anywhere in repo
fd -t d "weights|data|models|artifacts" -x ls -ld {} \; 2>/dev/null | head -20

Length of output: 197

🏁 Script executed:

#!/bin/bash
# Display full current .gitignore content
cat .gitignore

Length of output: 235

🏁 Script executed:

#!/bin/bash
# Check git status of critical directories
git status AI/transformer/weights AI/data 2>&1 | head -30

Length of output: 61

🏁 Script executed:

#!/bin/bash
# Check if weights/data files are tracked by git
git ls-files AI/transformer/weights AI/data 2>/dev/null | head -20

Length of output: 344

🏁 Script executed:

#!/bin/bash
# Look for .gitignore history or backups
find . -name "*gitignore*" -type f 2>/dev/null

Length of output: 33

🏁 Script executed:

#!/bin/bash
# Check git log for .gitignore changes
git log --oneline .gitignore 2>/dev/null | head -5

Length of output: 128

🏁 Script executed:

#!/bin/bash
# Check backend/.gitignore for reference patterns
cat backend/.gitignore

Length of output: 573

🏁 Script executed:

#!/bin/bash
# Check commit message for context
git show 5cbb3c7 --stat | head -30

Length of output: 1618

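The .gitignore simplification was intentional, but it risks exposing data files

Verification confirmed that simplifying .gitignore (32 lines → 6 lines) was an intentional decision. The same commit (5cbb3c7) added AI/transformer/weights/initial.weights.h5 and transformer/scaler/scaler.pkl, which appears to be a design decision to version-control the model weights.

Immediate action required:

The training data files (AI/data/input_data/news_summary_*.csv, AI/data/stability_score_2025.csv) are already tracked in the repository. They should be added to .gitignore.

Team review required:

Review whether version-controlling model weights in git is appropriate under team policy (DVC or a model registry is recommended)

The current structure can increase repository size and pose security risks.

Additional recommendations:
Restore the standard patterns: __pycache__/, *.pyc, .vscode/, .idea/, .DS_Store, *.log, and so on (see backend/.gitignore)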

🤖 Prompt for AI Agents

.gitignore lines 1-6: the simplified .gitignore removed many ignore patterns and allowed sensitive/large data and model files to be tracked (e.g., AI/data/input_data/news_summary_*.csv, AI/data/stability_score_2025.csv and model weights under AI/transformer); update .gitignore to add the specific data file patterns (AI/data/input_data/news_summary_*.csv and AI/data/stability_score_2025.csv) and common ignores (__pycache__/, *.pyc, .vscode/, .idea/, .DS_Store, *.log), and then remove tracked large/sensitive files from git history or untrack them (git rm --cached) and replace model weight tracking with DVC or a model registry per team policy.

coderabbitai · 2025-11-01T11:55:30Z

AI/configs/config.json

@@ -1,4 +1,4 @@
-{
+{


⚠️ Potential issue | 🟠 Major

Remove the BOM character.

A BOM character (U+FEFF) was added to the first line of the JSON file. This can cause JSON parsing errors.

Fix it as follows:

-{
+{

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

{

{

🤖 Prompt for AI Agents

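AI/configs/config.json lines 1-1: the first line of the file contains a UTF-8 BOM character (U+FEFF), which causes JSON parser errors, so remove it; use an editor (or tool) to save the file as UTF-8 without BOM, or delete the single invisible BOM character at the very start of the file and save.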
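
As a concrete illustration of the BOM problem and two ways around it, consider the sketch below. The helper names `strip_utf8_bom` and `load_json_tolerating_bom` are hypothetical and not part of the PR. The first rewrites a file without its leading BOM, and the second reads JSON with the `utf-8-sig` codec, which decodes plain UTF-8 and silently drops a BOM if one is present.

```python
import codecs
import json
from pathlib import Path
from typing import Any, Union

PathLike = Union[str, Path]


def strip_utf8_bom(path: PathLike) -> bool:
    """Rewrite `path` without a leading UTF-8 BOM. Return True if one was removed."""
    p = Path(path)
    data = p.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    p.write_bytes(data[len(codecs.BOM_UTF8):])
    return True


def load_json_tolerating_bom(path: PathLike) -> Any:
    """Parse a JSON file whether or not it starts with a UTF-8 BOM."""
    with open(path, "r", encoding="utf-8-sig") as f:
        return json.load(f)
```

Reading the same file with `encoding="utf-8"` leaves U+FEFF at the start of the text, and `json.loads` rejects it with a `JSONDecodeError`. Stripping the BOM once in the repository is the cleaner fix; the tolerant reader is only a safety net.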

coderabbitai · 2025-11-01T11:55:30Z

AI/finder/main.py

+        return []
+
+    csv_path = os.path.join(project_root, "data", "stability_score_2025.csv")


⚠️ Potential issue | 🔴 Critical

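Temporary hardcoding disables the Finder stage

With the run_finder() call commented out and a fixed ticker list returned, the Finder stage is bypassed entirely. As a result, no actual news/indicator-based analysis is reflected and the pipeline's core functionality is broken. Restore the original logic and, if needed, exit safely only when the result is empty.

-    #top_tickers = run_finder()
-    top_tickers = ['AAPL', 'MSFT', 'GOOGL']  # temporary data
+    top_tickers = run_finder()
+    if not top_tickers:
+        print("Finder did not find any tickers.")
+        return []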

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In AI/finder/main.py around lines 26 to 28, the Finder stage has been bypassed by commenting out the run_finder() call and returning a hardcoded ticker list; restore the original call to run_finder() (uncomment or reintroduce it) and remove the temporary hardcoded return so the pipeline uses actual news/metric results, and add a safe early-exit guard that only returns an empty list when run_finder() legitimately yields no candidates (or raises a handled exception) so downstream logic remains stable.

Will change this in the next patch.

@twq110, understood! Confirmed that you will fix this in the next patch.

I can create an issue for tracking if you'd like. Would you like me to create one?

coderabbitai · 2025-11-01T11:55:30Z

AI/libs/utils/__init__.py

@@ -0,0 +1,8 @@
+# AI/libs/utils/__init__.py


⚠️ Potential issue | 🟠 Major

Remove the BOM character.

The first line of the file contains a BOM character (U+FEFF). This can cause encoding problems.

Fix it as follows:

-# AI/libs/utils/__init__.py
+# AI/libs/utils/__init__.py

📝 Committable suggestion

Suggested change

# AI/libs/utils/__init__.py

# AI/libs/utils/__init__.py

🤖 Prompt for AI Agents

AI/libs/utils/__init__.py lines 1-1: the file contains a leading BOM character on the first line; remove the BOM so the file is plain UTF-8 (no BOM). Open the file in an editor that can show invisible characters or use a tool/command to strip the BOM, delete the leading invisible character, and re-save the file as UTF-8 without BOM; verify by reopening and ensuring the first character is the expected ASCII character (or start of file) and no BOM remains.

coderabbitai · 2025-11-01T11:55:30Z

AI/requirements.txt

@@ -1,4 +1,4 @@
-pandas
+pandas


⚠️ Potential issue | 🟠 Major

Remove the BOM (Byte Order Mark) character.

The first line of the file contains a BOM character (U+FEFF). This can cause encoding problems and may trigger errors during pip installation.

Fix it as follows:

-pandas
+pandas

📝 Committable suggestion

Suggested change

pandas

pandas

🤖 Prompt for AI Agents

In AI/requirements.txt around lines 1 to 1, the first line contains a BOM (Byte Order Mark) character before "pandas"; remove the leading BOM so the file is saved as UTF-8 without BOM and the line begins exactly with "pandas" (no hidden characters), then re-save the file and commit the change.

coderabbitai · 2025-11-01T11:55:31Z

AI/transformer/__init__.py

@@ -1,3 +1,3 @@
-# AI/finder/__init__.py
+# AI/run_transformer/__init__.py


⚠️ Potential issue | 🟡 Minor

The file path in the comment does not match the actual file path.

The comment says AI/run_transformer/__init__.py, but the actual file path is AI/transformer/__init__.py. The BOM character should also be removed.

Fix it as follows:

-# AI/run_transformer/__init__.py
+# AI/transformer/__init__.py

📝 Committable suggestion

Suggested change

# AI/run_transformer/__init__.py

# AI/transformer/__init__.py

🤖 Prompt for AI Agents

In AI/transformer/__init__.py around lines 1 to 1, the file contains a leading BOM and an incorrect header comment referencing AI/run_transformer/__init__.py; remove the BOM (byte order mark) from the top of the file and update the header/comment so the path matches AI/transformer/__init__.py (or delete the misleading comment entirely), ensuring the file starts with a normal UTF-8 character and correct path annotation.

AI/transformer/main.py

coderabbitai · 2025-11-01T11:55:31Z

AI/transformer/modules/features.py

+    O = df["open"].astype(float)
+    H = df["high"].astype(float)
+    L = df["low"].astype(float)
+    C = df["close"].astype(float)
+    V = df["volume"].astype(float)


⚠️ Potential issue | 🟡 Minor

Remove the unused variable.

The variable O (open price) is assigned but never used. Remove it to keep the code concise.

Fix it as follows:

-    O = df["open"].astype(float)
     H = df["high"].astype(float)
     L = df["low"].astype(float)
     C = df["close"].astype(float)
     V = df["volume"].astype(float)

📝 Committable suggestion

Suggested change

O = df["open"].astype(float)

H = df["high"].astype(float)

L = df["low"].astype(float)

C = df["close"].astype(float)

V = df["volume"].astype(float)

H = df["high"].astype(float)

L = df["low"].astype(float)

C = df["close"].astype(float)

V = df["volume"].astype(float)

🧰 Tools

🪛 Ruff (0.14.2)

107-107: Ambiguous variable name: O

(E741)

107-107: Local variable O is assigned to but never used

Remove assignment to unused variable O

(F841)

🤖 Prompt for AI Agents

In AI/transformer/modules/features.py around lines 107 to 111, the variable O is assigned from df["open"] but never used; remove the unused assignment line (O = df["open"].astype(float)) so only H, L, C, V are created, then run linters/tests to confirm there are no remaining references to O.

This was supposed to be used, so it's odd that it isn't. Please open an issue.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Create a GitHub issue to track the unused open price variable
gh issue create \
  --title "[FEAT] Open price (O) variable is defined in features.py but never used" \
  --body "## Description
The open price variable \`O\` is defined in the \`build_features\` function of \`AI/transformer/modules/features.py\` but is never actually used.

## Location
- File: \`AI/transformer/modules/features.py\`
- Lines: 107-111

## Details
Currently the following variables are defined from the OHLCV data:
- \`O\` = open price (defined but unused)
- \`H\` = high price
- \`L\` = low price
- \`C\` = close price
- \`V\` = volume

The open price may be needed for some technical-indicator calculations, so where this variable should be used needs to be reviewed.

## Related links
- PR: https://github.com/SISC-IT/sisc-web/pull/71
- Comment: https://github.com/SISC-IT/sisc-web/pull/71#discussion_r2483685088
- Requester: @twq110" \
  --assignee twq110

Length of output: 45

@twq110, I've created the issue!

This issue tracks the problem of the open price variable O being defined but not used:

Issue #72: [FEAT] Open price (O) variable is defined in features.py but never used

In that issue you can review which technical-indicator calculations the O variable should be used in. The issue has been assigned to you.

coderabbitai · 2025-11-01T11:55:31Z

AI/transformer/training/__init__.py

@@ -0,0 +1,3 @@
+# AI/training/train_transformer/__init__.py


⚠️ Potential issue | 🟠 Major

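Fix the BOM character and the wrong file path in the comment.

The first line of the file contains a BOM character (U+FEFF), and the path in the comment differs from the actual file path.

Fix it as follows:

-# AI/training/train_transformer/__init__.py
+# AI/transformer/training/__init__.py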

📝 Committable suggestion

Suggested change

# AI/training/train_transformer/__init__.py

# AI/transformer/training/__init__.py

🤖 Prompt for AI Agents

In AI/transformer/training/__init__.py around line 1, remove the leading BOM character (U+FEFF) from the file start and update the file-header comment path to the correct path (AI/transformer/training/__init__.py) so the comment matches the actual file location; save the file without BOM (UTF-8 without BOM) and ensure the corrected comment is the first line.

coderabbitai · 2025-11-01T11:55:31Z

AI/transformer/training/__init__.py

+from .train_transformer import train_transformer_classifier
+__all__ = ["run_transformer"]


⚠️ Potential issue | 🔴 Critical

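The import and the export do not match.

train_transformer_classifier is imported, but run_transformer is exported. Because the name listed in __all__ is never defined in the module, this fails at import time: importing run_transformer by name raises ImportError, and a wildcard import (from ... import *) raises AttributeError.

Fix it in one of the following ways:

Option 1: make the export match the import

-__all__ = ["run_transformer"]
+__all__ = ["train_transformer_classifier"]

Option 2: alias train_transformer_classifier as run_transformer

 from .train_transformer import train_transformer_classifier
+
+run_transformer = train_transformer_classifier
+
 __all__ = ["run_transformer"]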

📝 Committable suggestion

Suggested change

from .train_transformer import train_transformer_classifier

__all__ = ["run_transformer"]

from .train_transformer import train_transformer_classifier

__all__ = ["train_transformer_classifier"]

Suggested change

from .train_transformer import train_transformer_classifier

__all__ = ["run_transformer"]

from .train_transformer import train_transformer_classifier

run_transformer = train_transformer_classifier

__all__ = ["run_transformer"]

🤖 Prompt for AI Agents

AI/transformer/training/__init__.py lines 2-3: the module imports train_transformer_classifier but exports run_transformer, causing an ImportError; fix by making the import/export consistent — either change __all__ to ["train_transformer_classifier"] to export the imported name, or alias the import with from .train_transformer import train_transformer_classifier as run_transformer and keep __all__ = ["run_transformer"] so the exported name matches the public API.
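
The failure mode can be reproduced without touching the real package. The sketch below builds an in-memory stand-in module; `fake_training`, `make_training_package`, and `star_import` are hypothetical names used only for this demonstration, and the stub function does no training.

```python
import sys
import types
from typing import Iterable, Set


def make_training_package(exported_names: Iterable[str]) -> types.ModuleType:
    """Register an in-memory stand-in for a package __init__ (illustration only)."""
    module = types.ModuleType("fake_training")

    def train_transformer_classifier() -> str:
        return "trained"  # Stub: stands in for the real training entry point.

    module.train_transformer_classifier = train_transformer_classifier
    module.__all__ = list(exported_names)
    sys.modules["fake_training"] = module
    return module


def star_import(module_name: str) -> Set[str]:
    """Run `from <module_name> import *` and return the names it brought in."""
    namespace: dict = {}
    exec(f"from {module_name} import *", namespace)
    return {name for name in namespace if name != "__builtins__"}
```

With `__all__ = ["run_transformer"]` the wildcard import raises AttributeError because the module has no such attribute, while `from fake_training import run_transformer` raises ImportError. Once `__all__` lists `train_transformer_classifier`, the wildcard import succeeds and exposes exactly that name.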

twq110

Found a few issues; fixes to follow.

AI/transformer/main.py

twq110 · 2025-11-01T13:07:44Z

AI/transformer/modules/features.py

+    O = df["open"].astype(float)
+    H = df["high"].astype(float)
+    L = df["low"].astype(float)
+    C = df["close"].astype(float)
+    V = df["volume"].astype(float)


This is supposed to be used, so it's odd that it isn't. Please open an issue.
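The thread does not say which features were meant to consume the open price. Purely as an illustration, two common open-based quantities are sketched below: the intraday return (close over open) and the overnight gap (open over the previous close). The function names are hypothetical, and plain sequences stand in for the DataFrame columns so the sketch has no dependencies.

```python
from typing import List, Optional, Sequence


def intraday_return(opens: Sequence[float], closes: Sequence[float]) -> List[float]:
    """Close-over-open return for each bar: C / O - 1."""
    return [c / o - 1.0 for o, c in zip(opens, closes)]


def overnight_gap(opens: Sequence[float], closes: Sequence[float]) -> List[Optional[float]]:
    """Open relative to the previous close: O_t / C_{t-1} - 1.

    The first bar has no previous close, so its gap is None.
    """
    if not opens:
        return []
    gaps: List[Optional[float]] = [None]
    gaps.extend(o / prev_c - 1.0 for o, prev_c in zip(opens[1:], closes[:-1]))
    return gaps
```

With the O and C series from the diff, the pandas equivalents would be C / O - 1 and O / C.shift(1) - 1. Whether either belongs in features.py is a decision for the follow-up issue.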

[AI] SISC2-37 [FEAT] Implement saving of transform training files

5cbb3c7

twq110 requested a review from Kosw6 as a code owner November 1, 2025 11:46

coderabbitai bot reviewed Nov 1, 2025

twq110 added 3 commits November 1, 2025 21:54

[AI] SISC2-37 [FIX] Roll back .gitignore

014a9bd

Restore unintended deletions in .gitignore

[AI] SISC2-37 [FIX] Fix exit code on DB connection failure

a554bca

Exit with an exit code when the DB connection fails

[AI] SISC2-37 [STYLE] Fix typo

e355350

Rename run_transformer to transformer

twq110 commented Nov 1, 2025

twq110 merged commit 561a240 into main Nov 1, 2025
1 check was pending

twq110 deleted the SISC2-37-AI-석재빈-trasform-학습-파일-저장기능-구현 branch November 1, 2025 13:12

This was referenced Nov 1, 2025

[FEAT] Open price (O) variable is defined in features.py but never used #72

Open

[FEAT] transformer weights_path loading logic needs improvement #73

Closed

[AI] SISC2-39 [FIX] Final pipeline verification #80

Merged

coderabbitai bot mentioned this pull request Nov 20, 2025

[AI] Sisc2 46 ai Write asset-allocation training code #101

Merged

coderabbitai bot mentioned this pull request Jan 17, 2026

[AI] SISC-185 [FEAT] Write basic reinforcement learning code #186

Merged

This was referenced Jan 24, 2026

Feat/sisc 185 ai transformer model upgrade #194

Merged

Feat/sisc 195 data collactor supplement #199

Merged

coderabbitai bot mentioned this pull request Feb 2, 2026

[AI] SISC-195 Fix and refactor feature code in the computation process #201

Open

