Based on the data provided, we can analyze it from the following angles.
Assume we have the following data fields:
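streamer_name: livestream room name
video_views: video views
sales_amount: sales amount
fan_count: follower count
top_3_video_views_sum: total video referral views of the three livestream rooms with the highest values
import pandas as pd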
from scipy.stats import pearsonr
# Load the data into a DataFrame df
df = pd.read_csv('livestream_data.csv')
# Compute the video referral ratio (video views per follower)
df['video_view_ratio'] = df['video_views'] / df['fan_count']
# Sum the video views of the top 3 livestream rooms
top_3_video_views_sum = df.nlargest(3, 'video_views')['video_views'].sum()
# Compute the Pearson correlation between referral ratio and sales amount
correlation, p_value = pearsonr(df['video_view_ratio'], df['sales_amount'])
print(f"Correlation: {correlation}, P-value: {p_value}")
# Head-effect analysis: share of total video views held by the top 3 rooms
top_3_ratio = (top_3_video_views_sum / df['video_views'].sum()) * 100
print(f"Top 3 livestreams' share of video referral views: {top_3_ratio:.2f}%")
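# Category distribution of high-referral-ratio livestreams (assumes a 'category' column exists)
high_ratio_groups = df[df['video_view_ratio'] > df['video_view_ratio'].quantile(0.9)]['category']
print(f"Product categories of livestreams in the top 10% by referral ratio: {high_ratio_groups.unique()}")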
# Relationship between follower count and referral ability
fan_views_corr, _ = pearsonr(df['fan_count'], df['video_view_ratio'])
print(f"Correlation between follower count and video referral ratio: {fan_views_corr}")
Data source for the analysis above: 互联岛
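One practical caveat: dividing by fan_count produces infinite or missing ratios for rooms with zero followers, and pearsonr does not handle such values well. The sketch below is one possible way to guard against that, not part of the original analysis. The function name referral_summary, its top_n parameter, and the sample rows are all hypothetical and made up purely for illustration; only the column names come from the field list above.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr


def referral_summary(df: pd.DataFrame, top_n: int = 3) -> dict:
    """Hypothetical helper: referral ratio vs. sales correlation plus top-N share."""
    # Treat zero followers as missing so the division never yields inf.
    ratio = df["video_views"] / df["fan_count"].replace(0, np.nan)

    # Keep only rows where both the ratio and the sales amount are present.
    valid = pd.DataFrame({"ratio": ratio, "sales": df["sales_amount"]}).dropna()

    # Pearson's r needs at least two points and non-constant inputs.
    enough_data = (
        len(valid) >= 2
        and valid["ratio"].nunique() > 1
        and valid["sales"].nunique() > 1
    )
    if enough_data:
        r, p = pearsonr(valid["ratio"], valid["sales"])
    else:
        r, p = float("nan"), float("nan")

    # Share of all video views held by the top_n rooms, in percent.
    total_views = df["video_views"].sum()
    if total_views > 0:
        top_share = df["video_views"].nlargest(top_n).sum() / total_views * 100
    else:
        top_share = float("nan")

    return {
        "n_used": len(valid),
        "pearson_r": float(r),
        "p_value": float(p),
        "top_share_pct": float(top_share),
    }


# Made-up sample rows for illustration only; this is not real livestream data.
sample = pd.DataFrame({
    "streamer_name": ["A", "B", "C", "D", "E"],
    "video_views": [100, 200, 300, 400, 0],
    "fan_count": [1000, 1000, 1000, 1000, 0],
    "sales_amount": [10.0, 20.0, 30.0, 40.0, 5.0],
})
summary = referral_summary(sample)
print(summary)
```

Reporting n_used alongside the coefficient makes it visible how many rooms were dropped during cleaning, which matters when interpreting the p-value on a small dataset.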