Public-opinion heat at the start and end of bull markets: what Weibo data says about stock market timing

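The stock market often follows a peculiar pattern: a bull market starts when almost nobody is paying attention and ends in a frenzy. What lies behind this? It is a question many investors care about.

Trading opportunities in bull-bear transitions

In recent years the market has swung repeatedly between bull and bear phases. During a prolonged bear market prices keep falling; before 2014, for example, many stocks had dropped to their lows. In such periods most ordinary investors lose interest and the market goes quiet. Some sharp money, however, quietly starts building positions at this stage. Certain private funds, for example, accumulate shares while prices are low. Once they hold enough, they begin to push prices up and stir public opinion so that the crowd follows.

Retail investors usually enter while prices are climbing. In the first half of 2015, rising prices pulled many new investors into the market. They bought from the earlier holders without noticing the risk, and took heavy losses when the market crashed.

Data sources for the quantitative analysis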

<p><pre><code class="prism language-python">import talib as ta  # technical-analysis library, used later for EMA smoothing
import pandas as pd
</code></pre></p>
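To study the link between market moves and public attention more closely, this analysis collects data from Weibo. The sample consists of 10 representative bloggers whose content spans several areas, from well-known finance bloggers to accounts where ordinary investors share their views. On the tooling side, pandas handles the aggregation and the talib module handles the smoothing. Each blogger's posts are read first and reduced to daily figures such as the number of posts and the average likes per post. The figures for all bloggers are then combined, so that the result reflects Weibo heat as fully and accurately as possible.

Using Weibo heat as a proxy for public opinion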

<p><pre><code class="prism language-python">import ast

def weibo_hot(name):
    """Return {date: [post_count, avg_likes, avg_comments, avg_reposts]} for one blogger."""
    # name is the blogger's Weibo name; the path points at the locally crawled data
    path = 'D:\\python_project\\crawl\\weibo\\base_data_%s.txt' % name
    with open(path, 'r', encoding='utf-8') as f:
        # Assumes the file holds a plain dict literal; literal_eval parses it
        # without executing arbitrary code the way eval would.
        dic = ast.literal_eval(f.read())

    data_date = {}  # date -> list of [likes, comments, reposts], one entry per post
    for item in dic.values():
        row = [item['attitudes'], item['comments'], item['reposts']]
        data_date.setdefault(item['created_at'], []).append(row)

    # Sort dates ascending (assumes 'created_at' strings sort chronologically)
    # and skip the first 500 posting dates, i.e. the account's early history.
    date_all = sorted(data_date)[500:]

    used_data = {}  # the figures actually used downstream
    for date in date_all:
        rows = data_date[date]
        number = len(rows)  # posts published on this date
        attitudes = sum(r[0] for r in rows) / number  # average likes per post
        comments = sum(r[1] for r in rows) / number   # average comments per post
        reposts = sum(r[2] for r in rows) / number    # average reposts per post
        used_data[date] = [number, attitudes, comments, reposts]
    return used_data  # a dict keyed by date
</code></pre></p>
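Can Weibo heat reflect public attention to the stock market? To measure it, we take four daily indicators for each blogger: the number of posts published, and the average likes, comments and reposts per post. Together they give a reasonably objective picture of how much attention Weibo users pay to stock topics. In the early phase of the 2014-2015 bull market, for example, heat did rise, but slowly, mainly because most of the public had not yet started following the market.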
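
To make the four indicators concrete, here is a minimal worked example on three made-up posts. The field names (`created_at`, `attitudes`, `comments`, `reposts`) follow the crawled records used in this post; the sample values and the helper name `daily_heat` are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical sample posts, not real data.
sample_posts = [
    {"created_at": "2015-06-01", "attitudes": 10, "comments": 4, "reposts": 2},
    {"created_at": "2015-06-01", "attitudes": 30, "comments": 6, "reposts": 4},
    {"created_at": "2015-06-02", "attitudes": 5, "comments": 1, "reposts": 0},
]

def daily_heat(posts):
    """Return {date: [post_count, avg_likes, avg_comments, avg_reposts]}."""
    by_date = defaultdict(list)
    for post in posts:
        by_date[post["created_at"]].append(
            (post["attitudes"], post["comments"], post["reposts"])
        )
    heat = {}
    for date in sorted(by_date):
        rows = by_date[date]
        n = len(rows)
        # zip(*rows) turns the list of posts into three columns, one per metric
        likes, comments, reposts = (sum(col) / n for col in zip(*rows))
        heat[date] = [n, likes, comments, reposts]
    return heat

heat = daily_heat(sample_posts)
print(heat["2015-06-01"])  # [2, 20.0, 5.0, 3.0]
```

Two posts on 2015-06-01 with 10 and 30 likes give a post count of 2 and an average of 20 likes, which is exactly the kind of row each blogger contributes per day.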

<p><pre><code class="prism language-python">def outcome():
    # Weibo names of the ten sampled bloggers
    all_person = ['李大霄', '花荣', '上海徐晓峰', '天津股侠', '微博股票', '天狼50陈浩', '雪球', '云财经', '宇辉战舰', '港股通AiH']
    columns = ['number', 'attitudes', 'comments', 'reposts']
    frames = []
    for person in all_person:
        # weibo_hot returns a dict keyed by date; turn it into one row per date
        frames.append(pd.DataFrame.from_dict(weibo_hot(name=person), orient='index', columns=columns))
    # For each date, average every metric over the bloggers who posted that day.
    # Grouping on the index replaces the chained .loc[...][...] += updates,
    # which are not guaranteed to write back to the DataFrame.
    outcome_data = pd.concat(frames).groupby(level=0).mean().sort_index()
    return outcome_data
</code></pre></p>
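The SME Board Index versus Weibo heat

To compare an index with Weibo heat, we use the SME Board Index (中小板指) rather than the Shanghai Composite, since the latter may be distorted by "national team" intervention. In the early phase of the 2014 bull market the SME Board Index was rising and Weibo heat was creeping up, but the rally had not yet drawn broad public attention. By mid-2015, on the eve of the crash, heat surged while the index sat at its highs, a combination that preceded the plunge. After the crash, heat gradually returned to normal levels.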
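
One simple way to make "heat surged" concrete is to flag days where heat sits well above its own recent baseline. The sketch below is not the method used in this post; the function name `heat_spikes`, the 5-day window, the threshold `k` and the sample series are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

def heat_spikes(values, window=5, k=2.0):
    """Return indices whose value exceeds the trailing mean by k trailing std devs."""
    spikes = []
    for i in range(window, len(values)):
        base = values[i - window:i]           # the preceding `window` observations
        mu, sigma = mean(base), pstdev(base)
        if sigma > 0 and values[i] > mu + k * sigma:
            spikes.append(i)
    return spikes

# Hypothetical daily heat values with one obvious jump at position 7.
series = [10, 11, 9, 10, 12, 11, 10, 40, 12, 11]
spikes = heat_spikes(series)
print(spikes)  # [7]
```

A real series would need a longer window and some care around holidays and days without posts, so treat this as a starting point rather than a signal.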

<p><pre><code class="prism language-python">if __name__ == '__main__':
    outcome_data = outcome()
    # Smooth each daily metric with a 30-period exponential moving average
    number_ema = ta.EMA(outcome_data.number, timeperiod=30)
    attitudes_ema = ta.EMA(outcome_data.attitudes, timeperiod=30)
    comments_ema = ta.EMA(outcome_data.comments, timeperiod=30)
    reposts_ema = ta.EMA(outcome_data.reposts, timeperiod=30)
</code></pre></p>
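
For readers unfamiliar with the smoothing step, an exponential moving average with period N updates as EMA_t = α·x_t + (1 − α)·EMA_{t−1}, with α = 2/(N + 1). The pure-Python sketch below shows that recursion. It seeds the first value with a simple average of the first N points, which to my understanding is also what TA-Lib does by default, but treat that detail as an assumption; the function name `ema` is likewise just for illustration.

```python
import math

def ema(values, period):
    """EMA seeded with a simple average; the first period-1 outputs are NaN."""
    if period < 1 or len(values) < period:
        return [math.nan] * len(values)
    alpha = 2.0 / (period + 1)
    out = [math.nan] * (period - 1)
    prev = sum(values[:period]) / period  # seed: simple average of the first window
    out.append(prev)
    for x in values[period:]:
        prev = alpha * x + (1 - alpha) * prev  # the EMA recursion
        out.append(prev)
    return out

smoothed = ema([1.0, 2.0, 3.0, 4.0, 5.0], period=3)
print(smoothed)  # [nan, nan, 2.0, 3.0, 4.0]
```

With a period of 30, as in the code above, α is about 0.065, so each new day moves the smoothed line only slightly and short-lived bursts of posting are damped.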
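How heat behaves in different market regimes

During 2017-2018, white-horse (blue-chip) stocks climbed while Weibo discussion heat declined. This suggests the rise was driven mainly by institutions, for example large funds concentrating their holdings in blue chips, with little retail participation. Heat has recently picked up again, which indicates more retail participation in the current rally and may signal a new bull market, although that is only one possibility.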
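
The "index up, heat down" pattern described here can be checked mechanically once the two series are aligned by date. The following is a rough sketch under that assumption; the function name `divergence_days`, the 3-day lookback and the sample numbers are hypothetical, and no real index data is used.

```python
def divergence_days(index_close, heat, lookback=3):
    """Return indices where the index rose over `lookback` days while heat fell."""
    if len(index_close) != len(heat):
        raise ValueError("series must be aligned and of equal length")
    days = []
    for i in range(lookback, len(heat)):
        index_up = index_close[i] > index_close[i - lookback]
        heat_down = heat[i] < heat[i - lookback]
        if index_up and heat_down:
            days.append(i)
    return days

# Hypothetical aligned series: closing index levels and smoothed heat.
index_close = [100, 101, 103, 104, 106, 105, 107]
heat = [50, 52, 51, 48, 53, 49, 45]
flags = divergence_days(index_close, heat)
print(flags)  # [3, 5, 6]
```

Long runs of flagged days would be consistent with an institution-led rally, though the flag says nothing about who is actually buying.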

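Possible improvements to the data source

This study draws several observations from Weibo data, but Weibo is not a professional financial data source and has its limits. Specialist investment platforms such as Xueqiu (雪球) may offer higher-quality data. In addition, building social-opinion monitoring into a quantitative trading system, for example as a plug-in, could help in judging market levels and spotting risk, and so may improve the precision of trading decisions.

Do you think a new bull market will arrive along with rising heat? Share your view in the comments, and feel free to like and repost.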