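The stock market often follows a distinctive pattern: bull markets start when few people are watching and end in euphoria. What drives this pattern? It is a question many investors care about deeply.
Trading Opportunities in Bull-Bear Transitions
Recently the stock market has swung repeatedly between bull and bear phases. During a prolonged bear market, prices keep falling; before 2014, for example, many stocks had sunk to their lows. Most ordinary investors lost interest and the market was quiet. Some sharper capital, however, began quietly building positions at this stage. Some private funds, for instance, accumulated shares while prices were low. Once their positions were large enough, they started pushing prices up and shaping public opinion to draw the crowd in.
Retail investors typically enter while prices are climbing. In the first half of 2015, for example, rising prices drew many new investors into the market. They took over shares as buyers, unaware of the risk, and suffered heavy losses when the market crashed.
Data Sources for the Quantitative Analysis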
<p><pre> <code class="prism language-python">import talib as ta  # TA-Lib technical-analysis module
import pandas as pd
</code></pre></p>
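To study the link between market movements and public attention more closely, this analysis uses data collected from Weibo. The sample consists of 10 representative bloggers whose content spans several areas, from well-known finance commentators to accounts where ordinary investors share their views. On the technical side, the analysis uses tools such as the talib module. Each blogger's data is read first and reduced to key daily figures such as the number of posts and the average number of likes. The figures for all bloggers are then combined so that the result reflects Weibo heat as fully and accurately as possible.
Using Weibo Heat as a Proxy for Public Opinion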
<p><pre> <code class="prism language-python">import ast

def weibo_hot(name):
    """Return {date: [post count, mean likes, mean comments, mean reposts]}.

    name is the Weibo blogger whose crawled data file is read.
    """
    path = 'D:\\python_project\\crawl\\weibo\\base_data_%s.txt' % name
    with open(path, 'r', encoding='utf-8') as f:
        # Assumes the file holds a plain dict literal; literal_eval parses it
        # without executing arbitrary code, unlike eval
        dic = ast.literal_eval(f.read())
    data_date = {}  # date -> list of [likes, comments, reposts], one entry per post
    for item in dic.values():
        stats = [item['attitudes'], item['comments'], item['reposts']]
        data_date.setdefault(item['created_at'], []).append(stats)
    # Sort dates ascending (earliest first; assumes the date strings sort
    # chronologically) and skip the first 500 posting dates, i.e. the early
    # period after the account was opened
    date_all = sorted(data_date)[500:]
    used_data = {}  # the per-date figures used downstream
    for date in date_all:
        posts = data_date[date]
        number = len(posts)  # posts published on this date
        attitudes = sum(p[0] for p in posts) / number  # mean likes per post
        comments = sum(p[1] for p in posts) / number  # mean comments per post
        reposts = sum(p[2] for p in posts) / number  # mean reposts per post
        used_data[date] = [number, attitudes, comments, reposts]
    return used_data  # a dict keyed by date
</code></pre></p>
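To make the per-day aggregation easier to follow, here is a minimal sketch of the same idea on in-memory data. The `sample_posts` dictionary and the `daily_heat` helper are hypothetical; only the field names (`created_at`, `attitudes`, `comments`, `reposts`) come from the function above, and the real crawled files may look different.

```python
import pandas as pd

# Hypothetical sample that mimics the assumed layout: post id -> post fields.
sample_posts = {
    "p1": {"created_at": "2015-06-01", "attitudes": 10, "comments": 4, "reposts": 2},
    "p2": {"created_at": "2015-06-01", "attitudes": 30, "comments": 6, "reposts": 4},
    "p3": {"created_at": "2015-06-02", "attitudes": 8, "comments": 1, "reposts": 0},
}


def daily_heat(posts):
    """Return one row per date: post count plus mean likes/comments/reposts."""
    frame = pd.DataFrame(list(posts.values()))
    grouped = frame.groupby("created_at")
    daily = grouped[["attitudes", "comments", "reposts"]].mean()
    daily.insert(0, "number", grouped.size())  # posts published on each date
    return daily.sort_index()


daily = daily_heat(sample_posts)
print(daily)
```

Each row of `daily` corresponds to one entry of the dictionary that `weibo_hot` returns, without the 500-date cutoff.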
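Can Weibo heat reflect public attention to the stock market? To find out, we use four measures of heat: the number of posts each blogger publishes per day, and the average number of likes, comments, and reposts per post. These measures give a reasonably unbiased picture of how much attention Weibo users pay to stock-market topics. In the early stage of the 2014-2015 bull market, for example, Weibo heat rose, but only slowly, mainly because most of the public had not yet started following the market.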
<p><pre> <code class="prism language-python">def outcome():
    # Weibo account names of all sampled bloggers
    all_person = ['李大霄', '花荣', '上海徐晓峰', '天津股侠', '微博股票', '天狼50陈浩', '雪球', '云财经', '宇辉战舰', '港股通AiH']
    columns = ['number', 'attitudes', 'comments', 'reposts']
    outcome_data = pd.DataFrame(columns=columns)  # start from an empty DataFrame
    each_num = {}  # date -> number of bloggers with data on that date
    for person in all_person:
        data = pd.DataFrame(weibo_hot(name=person)).T  # rows are dates, columns 0..3
        for each in data.index.values:  # the index holds dates
            if each not in outcome_data.index:
                each_num[each] = 1
                outcome_data.loc[each] = data.loc[each].tolist()
            else:
                each_num[each] += 1
                # .loc[row, column] writes in place; chained indexing such as
                # .loc[row][column] may update a temporary copy instead
                for position, column in enumerate(columns):
                    outcome_data.loc[each, column] += data.loc[each, position]
    for each in outcome_data.index.values:
        # Divide by the number of contributing bloggers: per-blogger daily averages
        outcome_data.loc[each] = outcome_data.loc[each] / each_num[each]
    # talib expects float input, so cast before returning the date-sorted frame
    return outcome_data.astype(float).sort_index()
</code></pre></p>
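The cross-blogger averaging can also be written without explicit loops. The sketch below is one possible equivalent, not the author's code: `average_across_bloggers` and the `toy` input are hypothetical, and the input is assumed to have the same shape as the dictionaries returned by `weibo_hot`.

```python
import pandas as pd

COLUMNS = ["number", "attitudes", "comments", "reposts"]


def average_across_bloggers(per_blogger):
    """Average daily figures over the bloggers that have data on each date.

    per_blogger maps a blogger name to {date: [number, attitudes, comments, reposts]}.
    """
    frames = [
        pd.DataFrame.from_dict(daily, orient="index", columns=COLUMNS)
        for daily in per_blogger.values()
    ]
    stacked = pd.concat(frames)  # one row per (blogger, date) pair
    return stacked.groupby(level=0).mean().sort_index()


# Hypothetical input: two bloggers, only one of them posted on 2015-06-02.
toy = {
    "blogger_a": {"2015-06-01": [2, 20.0, 5.0, 3.0], "2015-06-02": [1, 8.0, 1.0, 0.0]},
    "blogger_b": {"2015-06-01": [4, 10.0, 3.0, 1.0]},
}
combined = average_across_bloggers(toy)
print(combined)
```

Grouping by the date index divides each sum by the number of bloggers who actually posted that day, which is the role `each_num` plays in `outcome`.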
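SME Board Index versus Weibo Heat
To compare an index with Weibo heat, the SME Board Index (中小板指) is used, because the Shanghai Composite may be distorted by intervention from the "national team" of state-backed funds. In the early stage of the bull market in 2014, the SME Board Index trended upward and Weibo heat rose gradually, but the public was not yet paying close attention. By mid-2015, on the eve of the crash, Weibo heat surged while the index sat at a high, a combination that preceded the plunge. After the crash, heat gradually returned to normal levels.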
<p><pre> <code class="prism language-python">if __name__ == '__main__':
    outcome_data = outcome()
    # Smooth each heat metric with a 30-period exponential moving average
    number_ema = ta.EMA(outcome_data.number, timeperiod=30)
    attitudes_ema = ta.EMA(outcome_data.attitudes, timeperiod=30)
    comments_ema = ta.EMA(outcome_data.comments, timeperiod=30)
    reposts_ema = ta.EMA(outcome_data.reposts, timeperiod=30)
</code></pre></p>
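For readers without TA-Lib installed, a similar smoothing can be sketched with pandas alone. This swaps in pandas' `ewm` rather than reproducing `ta.EMA` exactly: both use the smoothing factor 2 / (span + 1), but as far as I know TA-Lib seeds the average with a simple mean of the first `timeperiod` values and returns NaN before that point, so the early values differ. The `ema` helper and the sample series are hypothetical.

```python
import pandas as pd


def ema(series, span=30):
    """Exponential moving average with smoothing factor alpha = 2 / (span + 1)."""
    return series.astype(float).ewm(span=span, adjust=False).mean()


# Hypothetical four-day heat series; span=3 gives alpha = 0.5.
heat = pd.Series([1.0, 2.0, 3.0, 4.0], index=["d1", "d2", "d3", "d4"])
smoothed = ema(heat, span=3)
print(smoothed)
```

With `adjust=False`, each value is `alpha * today + (1 - alpha) * previous_ema`, starting from the first observation.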
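Heat Under Different Market Conditions
In 2017 and 2018, blue-chip (白马股) prices climbed while Weibo discussion declined. This suggests the rally was driven mainly by institutions, for example large funds concentrating their holdings in blue chips, with little retail participation. Weibo heat has picked up again recently, which points to more retail participation in the current rally and may signal a new bull market, although that is only one possibility.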
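The "prices up, heat down" pattern described here can be checked mechanically. The following is only a sketch under stated assumptions: `divergence`, the 60-row default window, and the toy numbers are all hypothetical, and both inputs are assumed to be pandas Series that share a date index.

```python
import pandas as pd


def divergence(index_close, heat, window=60):
    """Flag rows where the index rose over `window` rows while heat fell."""
    aligned = pd.concat({"close": index_close, "heat": heat}, axis=1).dropna()
    change = aligned.pct_change(periods=window)
    # The first `window` rows have no earlier value to compare with, so they are False.
    return (change["close"] > 0) & (change["heat"] < 0)


# Hypothetical five-day example with a two-row comparison window.
close = pd.Series([100.0, 101.0, 102.0, 103.0, 104.0])
heat = pd.Series([50.0, 48.0, 46.0, 44.0, 47.0])
flags = divergence(close, heat, window=2)
print(flags.tolist())
```

A long run of flagged dates would be consistent with an institution-led rally, but it does not prove one; the window length in particular is a judgment call.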
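Possible Improvements to the Data Source
This analysis rests on Weibo data, which is not professional financial data and has its limits. A dedicated investing platform such as Xueqiu (雪球) might offer higher-quality data. Beyond that, building social-sentiment monitoring into a quantitative trading system, for example as a plug-in, could help analyze index levels and identify risk, and so improve the precision of trading decisions.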
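As one illustration of what such a plug-in might look like, the sketch below flags days on which heat is unusually high relative to its own recent history. Everything here is hypothetical: the `heat_alert` name, the 250-row lookback, and the z-score threshold of 2 are illustrative choices, not values taken from this analysis.

```python
import pandas as pd


def heat_alert(heat, lookback=250, z_threshold=2.0):
    """Return a boolean Series: True where heat is far above its trailing history."""
    # shift(1) keeps the current day out of its own baseline.
    history = heat.shift(1).rolling(lookback, min_periods=lookback)
    z_score = (heat - history.mean()) / history.std()
    return z_score > z_threshold  # rows without a full history compare as False


# Hypothetical series: stable heat followed by a sudden spike.
sample_heat = pd.Series([10.0, 11.0, 9.0, 10.0, 11.0, 9.0, 30.0])
alerts = heat_alert(sample_heat, lookback=5, z_threshold=2.0)
print(alerts.tolist())
```

A trading system could call a function like this once per day and treat a `True` value as a prompt for closer review rather than as a trading signal by itself.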
Do you think a new bull market will arrive as heat rises? Share your view in the comments, and feel free to like and repost this article.