Estimating Short-Term Price Direction with Heuristics and News Sentiment (Python)
Predicting market movements is famously hard. Instead of claiming “AI will predict prices,” this Python script takes a more honest and practical approach:
Estimate the probability that price will move up or down in the near term using transparent heuristics.
This article explains the script’s design, the signals it blends, and how you can extend it.
Why a heuristic instead of a prediction model?
This script:
- ❌ does not predict price targets
- ❌ does not promise alpha
- ✅ produces a probabilistic directional bias
- ✅ remains interpretable
- ✅ combines price action + sentiment
That makes it useful for:
- trade-bias confirmation
- risk context and dashboards
- human-in-the-loop decision systems
- education and research
High-level architecture
The script estimates Prob Up vs Prob Down from five signal groups:
| Signal type | What it captures |
|---|---|
| Momentum | Recent returns (1, 5, 20 bars) |
| RSI | Overbought/oversold context |
| Trend | Price vs EMA(20) |
| News sentiment | Headline polarity (keyword or LLM) |
| Macro bias | Gold-aware tilt from macro keywords |
These are blended into a single score, then mapped into probabilities via a logistic function.
Step 1: Market data → momentum features
The script fetches OHLCV data and requires enough history (e.g., 30+ bars) to compute indicators.
It derives simple momentum features:
- ret1 = 1-bar return
- ret5 = 5-bar return
- ret20 = 20-bar return
Momentum is intentionally plain, with no complex pattern mining, because the goal is an explainable bias rather than an overfit prediction.
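The three returns above can be sketched in a few lines. This is a minimal illustration, not the script's actual code: the function names `pct_return` and `momentum_features` are hypothetical, and the input is assumed to be a plain list of closing prices ordered oldest to newest.

```python
def pct_return(closes, n):
    """Simple n-bar return: latest close divided by the close n bars ago, minus 1."""
    if len(closes) < n + 1:
        raise ValueError(f"need at least {n + 1} closes, got {len(closes)}")
    return closes[-1] / closes[-1 - n] - 1.0


def momentum_features(closes):
    """Return the three momentum features named in the article."""
    return {
        "ret1": pct_return(closes, 1),
        "ret5": pct_return(closes, 5),
        "ret20": pct_return(closes, 20),
    }


# Synthetic, steadily rising closes (illustrative data only).
closes = [100.0 + i for i in range(30)]
feats = momentum_features(closes)
```

The length check is why the script needs a minimum history: `ret20` alone requires 21 closes.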
Step 2: RSI as context (not a hard rule)
RSI(14) is computed (commonly via EMA-smoothed gains/losses) and converted into a continuous score.
Instead of rigid thresholds (“RSI > 70 = sell”), the script normalizes RSI around 50:
- RSI above 50 contributes bullish bias
- RSI below 50 contributes bearish bias
Because the result is a signed number rather than a buy/sell flag, RSI can be weighted alongside the other numeric signals.
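One way to implement this, assuming Wilder's smoothing (a common RSI variant, seeded with a simple average) and a linear mapping `(RSI - 50) / 50`. Both choices are assumptions, as the article does not state the exact formula, and the names `rsi` and `rsi_score` are hypothetical.

```python
def rsi(closes, period=14):
    """RSI with Wilder's smoothing; closes are ordered oldest to newest."""
    if len(closes) < period + 1:
        raise ValueError(f"need at least {period + 1} closes")
    deltas = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(d, 0.0) for d in deltas]
    losses = [max(-d, 0.0) for d in deltas]

    # Seed with a simple average, then smooth recursively.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    for g, l in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + g) / period
        avg_loss = (avg_loss * (period - 1) + l) / period

    if avg_loss == 0.0:
        # No losses at all: fully overbought, or neutral if price never moved.
        return 100.0 if avg_gain > 0.0 else 50.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def rsi_score(rsi_value):
    """Map RSI from [0, 100] to a signed score in [-1, +1], centred on 50."""
    return (rsi_value - 50.0) / 50.0
```

With this mapping an RSI of 65 becomes +0.3 and an RSI of 35 becomes -0.3, so the signal scales with how far RSI sits from neutral instead of flipping at a threshold.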
Step 3: Trend via EMA(20)
A short-term trend check compares current price to EMA(20):
- price above EMA(20) → mild bullish tilt
- price below EMA(20) → mild bearish tilt
This answers a simple question:
Is price trading above or below its recent trend baseline?
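A sketch of the trend check, assuming the standard EMA recursion with `alpha = 2 / (span + 1)` seeded at the first close. The `cap` parameter, which decides how far price must sit from the EMA to count as a full-strength signal, is a hypothetical tuning knob and not something the article specifies.

```python
def ema(values, span=20):
    """Exponential moving average of the whole series; returns the latest value."""
    if not values:
        raise ValueError("values must not be empty")
    alpha = 2.0 / (span + 1)
    out = values[0]
    for v in values[1:]:
        out = alpha * v + (1.0 - alpha) * out
    return out


def trend_score(closes, span=20, cap=0.05):
    """Signed distance of the last close from EMA(span), scaled into [-1, +1].

    A gap of +cap (5% by default, an assumed value) or more maps to +1.
    """
    baseline = ema(closes, span)
    gap = closes[-1] / baseline - 1.0
    return max(-1.0, min(1.0, gap / cap))
```

Scaling the gap rather than returning a bare above/below flag keeps the tilt mild when price is only slightly away from the baseline.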
Step 4: News sentiment (two modes)
Mode A: Keyword-based sentiment (fast + deterministic)
Headlines are scored against two keyword lists, one bullish and one bearish.
Bullish examples
- beats, upgrade, strong, record
Bearish examples
- miss, downgrade, lawsuit, warning
Scores are normalized to roughly [-1, +1].
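A minimal sketch of keyword scoring using only the example words listed above. The normalization `(bullish hits - bearish hits) / total hits` is an assumption, chosen because it keeps every headline inside [-1, +1]; the function names are hypothetical.

```python
import re

BULLISH = {"beats", "upgrade", "strong", "record"}
BEARISH = {"miss", "downgrade", "lawsuit", "warning"}


def headline_score(headline):
    """Score one headline in [-1, +1]; 0.0 when no keyword matches."""
    # Exact word matches only, so "upgrades" would not match "upgrade".
    words = re.findall(r"[a-z]+", headline.lower())
    pos = sum(w in BULLISH for w in words)
    neg = sum(w in BEARISH for w in words)
    hits = pos + neg
    return 0.0 if hits == 0 else (pos - neg) / hits


def news_sentiment(headlines):
    """Average headline score; neutral headlines pull the average toward 0."""
    scores = [headline_score(h) for h in headlines]
    return sum(scores) / len(scores) if scores else 0.0
```

Because the lists are plain sets, the scoring is deterministic and easy to audit: any score can be traced back to the exact words that produced it.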
Mode B: Optional LLM sentiment (Ollama)
If enabled, the script uses a local LLM (Ollama) to classify news text as:
- bullish
- neutral
- bearish
LLM mode is optional and meant to stay:
- local
- auditable
- supportive (not dominant)
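The sketch below shows one way such a mode could be wired, assuming Ollama's local REST endpoint `POST /api/generate` on the default port 11434. The model name `llama3`, the prompt wording, and the helper names are illustrative assumptions, not details taken from the script. The HTTP call is passed in as a parameter so that the label parsing can be tested without a running server.

```python
import json
import urllib.request

LABEL_TO_SCORE = {"bullish": 1.0, "neutral": 0.0, "bearish": -1.0}


def parse_label(reply):
    """Map a free-text reply to a score; ambiguous or unknown replies are neutral."""
    text = reply.strip().lower()
    found = [label for label in LABEL_TO_SCORE if label in text]
    return LABEL_TO_SCORE[found[0]] if len(found) == 1 else 0.0


def ask_ollama(prompt, model="llama3", host="http://localhost:11434", timeout=30):
    """Send one non-streaming generate request to a local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


def llm_sentiment(headline, ask=ask_ollama):
    """Classify a headline; fall back to neutral if the LLM is unavailable."""
    prompt = (
        "Classify the market sentiment of this headline as exactly one word: "
        f"bullish, neutral, or bearish.\nHeadline: {headline}"
    )
    try:
        return parse_label(ask(prompt))
    except (OSError, KeyError, ValueError):
        # Connection errors, malformed JSON, or a missing field: stay neutral.
        return 0.0
```

Falling back to 0.0 on any failure keeps the LLM supportive rather than dominant: if the local model is down, the rest of the score is unaffected.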
Step 5: Macro keyword tilt (gold-aware)
For gold-related symbols (e.g., XAU, GOLD, GC=F), macro context matters.
The script boosts sentiment when headlines mention:
- war / conflict
- central bank buying
- inflation
- debt
- rate cuts
- currency debasement
And penalizes:
- rate hikes
- hawkish policy
- strong dollar
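This helps prevent an equity-centric sentiment model from misreading commodity drivers. For example, a headline about conflict or inflation often reads as negative for stocks but is typically supportive for gold.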
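A rough sketch of the tilt, built from the phrases listed above. The per-mention `step`, the overall `cap`, and the function names are hypothetical; the article names the keywords but not the magnitudes.

```python
import re

GOLD_SYMBOL_HINTS = ("XAU", "GOLD", "GC=F")
GOLD_BULLISH = ("war", "conflict", "central bank buying", "inflation",
                "debt", "rate cut", "debasement")
GOLD_BEARISH = ("rate hike", "hawkish", "strong dollar")


def is_gold_symbol(symbol):
    """True when the ticker looks gold-related (substring match on known hints)."""
    s = symbol.upper()
    return any(hint in s for hint in GOLD_SYMBOL_HINTS)


def _mentions(phrases, text):
    # Whole-phrase matches with an optional plural "s", so "war" does not
    # match "warning" but "rate cut" does match "rate cuts".
    return sum(len(re.findall(r"\b" + re.escape(p) + r"s?\b", text)) for p in phrases)


def macro_tilt(symbol, headlines, step=0.1, cap=0.5):
    """Signed adjustment to sentiment for gold symbols; 0.0 for everything else."""
    if not is_gold_symbol(symbol):
        return 0.0
    text = " ".join(headlines).lower()
    raw = step * (_mentions(GOLD_BULLISH, text) - _mentions(GOLD_BEARISH, text))
    return max(-cap, min(cap, raw))
```

The cap matters because a busy news day could otherwise let macro keywords outweigh every price-based signal.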
Step 6: Blending into a score
Signals are combined with simple weights (example):
- momentum dominates (ret5, ret20)
- RSI and trend add context
- news sentiment supports (doesn’t lead)
This weighted score is intentionally conservative: no single feature should overwhelm the result.
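The blend can be sketched as a clamped weighted sum. The weights below are hypothetical and only mirror the stated priorities (momentum largest, everything else smaller). The sketch also assumes each input has already been rescaled to roughly [-1, +1], for instance by dividing raw returns by a typical move size, and that any macro tilt has already been folded into the news value.

```python
# Hypothetical weights; they sum to 1.0 so the score stays within [-1, +1].
WEIGHTS = {
    "ret5": 0.30,
    "ret20": 0.30,
    "ret1": 0.10,
    "rsi": 0.10,
    "trend": 0.10,
    "news": 0.10,
}


def blend(signals, weights=WEIGHTS):
    """Weighted sum of signals, each clamped to [-1, +1] before weighting.

    Missing signals count as 0.0 (neutral).
    """
    score = 0.0
    for name, weight in weights.items():
        value = max(-1.0, min(1.0, signals.get(name, 0.0)))
        score += weight * value
    return score
```

Clamping before weighting is what keeps the blend conservative: an extreme news score can move the result by at most its own weight.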
Step 7: Score → probability
A logistic (sigmoid) function converts the score into probabilities:
- prob_up = sigmoid(score × scale)
- prob_down = 1 − prob_up
Logistic mapping is useful because it:
- stays bounded between 0 and 1
- responds smoothly to changing signals
- naturally represents uncertainty
Many implementations also cap extremes (e.g., 1%–99%) to avoid false certainty.
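A small sketch of the mapping with the 1%-99% cap applied. The default `scale` of 3.0 is an assumed value chosen for illustration, and the function name is hypothetical.

```python
import math


def prob_up(score, scale=3.0, floor=0.01, ceil=0.99):
    """Logistic mapping from a blended score to a capped probability of 'up'."""
    # Bound the exponent so extreme scores cannot overflow math.exp.
    z = max(-50.0, min(50.0, score * scale))
    p = 1.0 / (1.0 + math.exp(-z))
    return max(floor, min(ceil, p))


p_up = prob_up(0.0)
p_down = 1.0 - p_up
```

A score of zero maps to exactly 50/50, and `scale` controls how quickly the probability moves away from that as the score grows.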
Example output (how to read it)
A typical run might produce:
- Prob Up: 63%
- Prob Down: 37%
With a breakdown like:
- ret5 positive
- ret20 positive
- RSI moderately above 50
- price above EMA(20)
- news sentiment slightly bullish
Interpretation:
“Given recent momentum, trend, and news, upward movement is more likely than downward, yet uncertainty remains.”
What this script is (and isn’t)
✅ Good for
- directional bias / confirmation
- trade filtering (only act when bias is strong)
- risk dashboards / monitoring
- research and education
❌ Not for
- price targets
- high-frequency trading
- fully automated execution
- “guaranteed” prediction claims
Practical extensions
If you want to evolve the script while keeping it interpretable:
- volatility-adjusted weighting (scale returns by ATR or realized vol)
- regime detection (trend vs mean-reversion)
- time-decay for news (fresh headlines matter more)
- symbol-specific calibration (gold vs stocks vs crypto)
- portfolio aggregation (bias across multiple instruments)
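As an illustration of one of these extensions, time-decay for news can be sketched as an exponentially weighted average. The 12-hour half-life and the `(score, age_hours)` input shape are assumptions made for this example.

```python
def decayed_sentiment(items, half_life_hours=12.0):
    """Weighted average of (score, age_hours) pairs; weight halves every half-life."""
    num = 0.0
    den = 0.0
    for score, age_hours in items:
        weight = 0.5 ** (max(age_hours, 0.0) / half_life_hours)
        num += weight * score
        den += weight
    return num / den if den else 0.0
```

With a fresh bullish headline and a 12-hour-old bearish one, the fresh headline carries twice the weight, so the combined sentiment stays mildly positive.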
Closing thoughts
This script is a solid example of post-hype engineering:
- transparent
- interpretable
- honest about uncertainty
Instead of asking “Can AI predict the market?”, it asks a better question:
Given what we know right now, which direction is more plausible?