Logistic Regression (LR): Comparing Results With and Without Normalization
Test data
x = [[1, 2], [2, 4], [2, 1], [4, 2] ,[4, 200000],[4,22]]
y = [0, 0, 1, 1, 1, 2]
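As the data show, the sample [4, 2] belongs to class 1. Below are the predictions for that sample, first without and then with normalization:
Predicted class: [2]
Predicted class probabilities [0, 1, 2]: [[0.20144214 0.24605966 0.5524982 ]]
------- With normalization --------
Predicted class: [2]
Predicted class probabilities [0, 1, 2]: [[0.25604625 0.33923733 0.40471641]]
Both runs misclassify the sample, but the normalized run does better: it assigns class 1 a probability of 0.339, compared with 0.246 for the unnormalized run.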
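One thing worth checking in the data above is how the outlier 200000 affects min-max scaling of the second feature. The sketch below is not part of the original code. It assumes plain min-max scaling into [0.01, 0.99], and the helper name `scale_column` is hypothetical.

```python
# Second feature of each training sample, taken from x above.
col = [2, 4, 1, 2, 200000, 22]


def scale_column(values, low=0.01, high=0.99):
    """Min-max scale a list of numbers into [low, high] (hypothetical helper).

    Assumes the list contains at least two distinct values.
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (high - low) + low for v in values]


scaled = scale_column(col)
for raw, s in zip(col, scaled):
    print(f"{raw:>6} -> {s:.6f}")

# Spread of the scaled values once the outlier is set aside.
others = [s for raw, s in zip(col, scaled) if raw != 200000]
print("spread without the outlier:", max(others) - min(others))
```

With these numbers, every value except the outlier lands within roughly 0.0001 of 0.01, so the second feature is nearly constant after scaling. That may be part of why the normalized model still misclassifies [4, 2], although the article does not test this.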
By 程忠, 2021-05-21 17:13:27
Python code:
To avoid errors caused by zero values, the code below scales each feature to [0.01, 0.99] rather than [0, 1].
from sklearn.linear_model import LogisticRegression
from sklearn2pmml.pipeline import PMMLPipeline
from sklearn2pmml import sklearn2pmml
from nyoka import skl_to_pmml

# Per-column minima and maxima, filled in by toOne()
minArray = []
maxArray = []


def main():
    x = [[1, 2], [2, 4], [2, 1], [4, 2], [4, 200000], [4, 22]]
    y = [0, 0, 1, 1, 1, 2]
    lr = LogisticRegression(class_weight='balanced')
    pipeline = PMMLPipeline([("classifier", lr)])
    pipeline.fit(x, y)
    # Export the unnormalized model to PMML
    # sklearn2pmml(pipeline, "./demo.pmml", with_repr=True)
    all_col_names = ["a", "b"]
    skl_to_pmml(pipeline, all_col_names, "a,b", "demo.pmml")
    print("[4, 2]: true class is 1")
    print("Predicted class:", lr.predict([[4, 2]]))
    print("Predicted class probabilities [0, 1, 2]:", lr.predict_proba([[4, 2]]))
    print("------- With normalization --------")
    # toOne() scales x in place, then the same pipeline is refitted
    x = toOne(x)
    pipeline.fit(x, y)
    print("Predicted class:", lr.predict(dealOne([[4, 2]])))
    print("Predicted class probabilities [0, 1, 2]:", lr.predict_proba(dealOne([[4, 2]])))


def toOne(matrix):
    # Record each column's minimum and maximum, then scale the matrix
    for j in range(len(matrix[0])):
        one_list = []
        for i in range(len(matrix)):
            one_list.append(int(matrix[i][j]))
        maxArray.append(max(one_list))
        minArray.append(min(one_list))
    return dealOne(matrix)


def dealOne(matrix):
    # Scale every value in place to [0.01, 0.99] using the recorded bounds
    for i in range(len(matrix)):
        for j in range(len(matrix[0])):
            matrix[i][j] = (matrix[i][j] - minArray[j]) / (maxArray[j] - minArray[j]) * 0.98 + 0.01
    return matrix


if __name__ == "__main__":
    main()
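The scaling helpers above keep each column's minimum and maximum in module-level lists and modify their argument in place. As a sketch of one alternative (not the author's code; `fit_min_max` and `transform` are hypothetical names), the bounds can be computed once from the training data and captured in a closure that returns new lists:

```python
def fit_min_max(rows, low=0.01, high=0.99):
    """Learn per-column bounds from rows and return a scaling function.

    Hypothetical helper, not from the original article. Assumes every
    column has at least two distinct values, otherwise hi - lo is zero.
    """
    mins = [min(col) for col in zip(*rows)]
    maxs = [max(col) for col in zip(*rows)]

    def transform(new_rows):
        # Build new lists so the caller's data is left untouched.
        return [
            [(v - lo) / (hi - lo) * (high - low) + low
             for v, lo, hi in zip(row, mins, maxs)]
            for row in new_rows
        ]

    return transform


x = [[1, 2], [2, 4], [2, 1], [4, 2], [4, 200000], [4, 22]]
scale = fit_min_max(x)     # bounds come from the training data only
x_scaled = scale(x)        # x itself is unchanged
sample = scale([[4, 2]])   # the query point is scaled with the same bounds
print(sample)
```

Because the bounds live inside `scale`, fitting a second data set cannot mix its minima and maxima with those of the first, and the raw training data stays available for the unnormalized comparison.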