Printing the elements of a sparse matrix in a Python Jupyter notebook

What happened

I had written a script like the following.

Script

from sklearn.feature_extraction.text import CountVectorizer
import numpy as np

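# Count how often each word occurs in the text and turn the counts into feature vectors (bag of words)
count_vectorizer = CountVectorizer()
# fit_transform returns a csr_matrix (sparse matrix)
feature_vectors = count_vectorizer.fit_transform(keywords)
# Slice out only the rows that belong to the training data
learning_vectors = feature_vectors[:len(read_from_learning_tsv(0))]
# Get the labels that correspond to the training data
learning_labels = np.array(read_from_learning_tsv(1))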

learning_vectors

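I wanted to inspect the contents of learning_vectors, so I evaluated it at the end of the cell. Because it is a sparse matrix, however, the notebook only displays the summary below, and the contents cannot be checked.

OUTPUT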
```
<940x655 sparse matrix of type '<type 'numpy.int64'>'
    with 2650 stored elements in Compressed Sparse Row format>
```
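
To make the situation easier to reproduce, here is a minimal sketch written for Python 3. The `keywords` list is a made-up toy corpus standing in for the real data loaded in the script above, so the numbers differ from the 940x655 matrix in the post. It shows that a bare expression in a notebook only displays the summary `repr`, while the shape and the number of stored elements are available as attributes.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus standing in for the real `keywords` data.
keywords = ["apple banana", "banana cherry banana", "cherry"]

count_vectorizer = CountVectorizer()
feature_vectors = count_vectorizer.fit_transform(keywords)

# repr() is what the notebook shows for a bare expression: a summary only.
summary = repr(feature_vectors)
print(summary)

# The dimensions and the count of stored (non-zero) entries are attributes.
print(feature_vectors.shape)  # (number of documents, vocabulary size)
print(feature_vectors.nnz)    # number of stored elements
```

The exact wording of the summary depends on the installed SciPy version, but in every case it describes the matrix rather than listing its values.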

todense

I found the following post on Stack Overflow.
python - How to access sparse matrix elements? - Stack Overflow

Calling todense() on a sparse matrix converts it back to an ordinary dense matrix.

I changed

learning_vectors

to

learning_vectors.todense()

and got the following output.

OUTPUT

matrix([[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ..., 
        [0, 0, 0, ..., 0, 0, 1],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]])
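
As the `matrix([[` prefix in the output suggests, todense() on a csr_matrix returns a numpy.matrix. If a plain numpy.ndarray is more convenient, SciPy sparse matrices also provide toarray(). The following is a minimal sketch of the difference; `sparse_vectors` is a small hand-built matrix with made-up values, used here as a stand-in for learning_vectors.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Small hand-built stand-in for learning_vectors (hypothetical values).
sparse_vectors = csr_matrix(np.array([[0, 0, 1],
                                      [2, 0, 0],
                                      [0, 0, 0]]))

dense_matrix = sparse_vectors.todense()  # numpy.matrix
dense_array = sparse_vectors.toarray()   # numpy.ndarray

print(type(dense_matrix).__name__)
print(type(dense_array).__name__)
print(dense_array)
```

Both calls allocate the full dense matrix in memory, so they are best kept for inspection of matrices that are small enough to fit.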

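Not every element is printed, because NumPy abbreviates large matrices with "...", so in practice this may not be very useful. Still, it resolved my own nagging unease about not being able to see the contents.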
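
Since most entries of a sparse matrix are zero, one way to see everything that matters is to list only the stored elements. The sketch below (Python 3) converts the matrix to COO format and walks over its (row, column, value) triples; it also uses np.printoptions to lift the "..." truncation for a small matrix. As before, `sparse_vectors` is a hand-built stand-in with made-up values, not the data from the post.

```python
import sys

import numpy as np
from scipy.sparse import csr_matrix

# Small hand-built stand-in for learning_vectors (hypothetical values).
sparse_vectors = csr_matrix(np.array([[0, 0, 1],
                                      [2, 0, 0],
                                      [0, 0, 0]]))

# COO format exposes parallel arrays of row indices, column indices and values.
coo = sparse_vectors.tocoo()
entries = [(int(r), int(c), int(v)) for r, c, v in zip(coo.row, coo.col, coo.data)]
for row, col, value in entries:
    print(f"({row}, {col}) -> {value}")

# Temporarily disable NumPy's "..." truncation to print every element.
# Only sensible for small matrices: a 940x655 matrix has over 600,000 entries.
with np.printoptions(threshold=sys.maxsize):
    print(sparse_vectors.toarray())
```

For the matrix in this post, listing the 2650 stored elements is likely to be more readable than printing the full dense form.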
That's all.
