LSTM regression prediction model

Technology, 2025-02-19

Cryptocurrency and Neural Network

Predicting stock prices is challenging because financial markets are volatile and non-linear. The problem at hand is price prediction: we are trying to predict a numerical value over a wide range (roughly 9,000 to 12,500), which makes it a regression problem. We will use a neural network to tackle it. The motivation is that a neural network is a data mining technique that can identify an underlying trend in the data and generalize from it.

We'll build a deep neural network that does some forecasting for us and use it to predict future prices. Let us load the hourly data.

Data loading

We have a total of 2001 data points representing the Bitcoin price in USD. We are interested in predicting the closing price for future dates.

When we see a time-series, we always want to know if the value of the current time step affects the next one.
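
One quick way to probe that question is the lag-1 autocorrelation. The sketch below is only an illustration: it uses a synthetic random walk as a stand-in for the close price (an assumption, not the article's data), and the helper name `lag1_autocorr` is hypothetical.

```python
import numpy as np

# Synthetic stand-in for the hourly close price: a random walk (assumption).
rng = np.random.default_rng(0)
close = 10_000 + rng.normal(0, 50, size=500).cumsum()

def lag1_autocorr(x):
    """Pearson correlation between x[t] and x[t-1]."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]

# Price levels: each value is strongly tied to the previous one.
price_autocorr = lag1_autocorr(close)

# Hour-to-hour changes: for a pure random walk these are close to uncorrelated.
diff_autocorr = lag1_autocorr(np.diff(close))

print(f"lag-1 autocorrelation of prices:  {price_autocorr:.3f}")
print(f"lag-1 autocorrelation of changes: {diff_autocorr:.3f}")
```

On real data, a high value for the levels combined with a low value for the changes would suggest that most of the apparent predictability comes from the price simply staying near its previous value.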

import matplotlib.pyplot as plt

# btc is the DataFrame of hourly BTC/USD data loaded above
plt.figure(figsize=(15, 5))
plt.plot(btc.close)
plt.title('BTC Close price (Hourly frequency)')
plt.xlabel('Date_time')
plt.ylabel('Price (US$)')
plt.show()

We will use an LSTM network, which can capture long-term dependencies in a sequence (e.g. the dependency between today's price and the price two weeks ago). The series is univariate: only the close price is used.

Let us normalize the price data using a min-max scaler.

Data normalization

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Min-max normalization: scale the close price into the 0-1 range
scaler = MinMaxScaler()
close_price = btc['close'].values.reshape(-1, 1)  # the scaler expects a 2-D array of shape (n_samples, n_features)
scaled_close = scaler.fit_transform(close_price)
scaled_close = scaled_close[~np.isnan(scaled_close)]  # remove NaNs (if any)
scaled_close = scaled_close.reshape(-1, 1)  # restore the 2-D shape after removing NaNs
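
To make the scaling step concrete, here is a small NumPy sketch of the min-max formula, x' = (x - min) / (max - min), and its inverse. The four prices are made-up illustration values, not taken from the dataset.

```python
import numpy as np

# Made-up prices for illustration only.
prices = np.array([9000.0, 10500.0, 12500.0, 9800.0]).reshape(-1, 1)

p_min, p_max = prices.min(), prices.max()

# Forward transform: map the smallest price to 0 and the largest to 1.
scaled = (prices - p_min) / (p_max - p_min)

# Inverse transform: map scaled values back to prices.
restored = scaled * (p_max - p_min) + p_min

print(scaled.ravel())
print(restored.ravel())
```

The inverse step is what lets scaled model outputs be read as prices again later on.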

Data processing for LSTM

LSTMs expect 3-D input, so we need to arrange the data into the shape [batch_size, sequence_length, n_features]. We also want to hold out some data for testing.

Let's build some sequences. They are built with a sliding window, much like the walk-forward validation approach: we define an initial sequence length, then shift one position to the right to create the next sequence. The process is repeated until all possible positions are used.

SEQ_LEN = 100  # each sequence covers 100 hours, the first one starting at position 0

def to_sequences(data, seq_len):
    d = []
    for index in range(len(data) - seq_len):
        d.append(data[index: index + seq_len])
    return np.array(d)

def preprocess(data_raw, seq_len, train_split):
    data = to_sequences(data_raw, seq_len)
    num_train = int(train_split * data.shape[0])
    # The first seq_len - 1 steps of each sequence are the input; the last step is the target
    X_train = data[:num_train, :-1, :]
    y_train = data[:num_train, -1, :]
    X_test = data[num_train:, :-1, :]
    y_test = data[num_train:, -1, :]
    return X_train, y_train, X_test, y_test

# Walk forward: the initial SEQ_LEN is defined above, then the window shifts one position
# to the right to create the next sequence, until all possible positions are used.
X_train, y_train, X_test, y_test = preprocess(scaled_close, SEQ_LEN, train_split=0.95)  # 5% of the data saved for testing
print(X_train.shape, X_test.shape)

# The model trains on 1805 sequences, each representing 99 hours of Bitcoin price changes.
# We will be predicting the price for the 96 held-out hours.
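
A toy run makes the resulting shapes easier to see. The sketch below applies the same windowing logic to ten made-up values with a window of 4 (both are illustration choices, not the article's settings).

```python
import numpy as np

def to_sequences(data, seq_len):
    # Same sliding-window logic as above, written as a list comprehension.
    return np.array([data[i:i + seq_len] for i in range(len(data) - seq_len)])

# Ten made-up observations of a single feature, shaped (n_samples, n_features).
toy = np.arange(10, dtype=float).reshape(-1, 1)

windows = to_sequences(toy, seq_len=4)

# First 3 steps of each window are the input, the 4th step is the target.
X = windows[:, :-1, :]
y = windows[:, -1, :]

print(windows.shape)  # (number of windows, seq_len, n_features)
print(X.shape, y.shape)
print(y.ravel())
```

Because the loop runs over `range(len(data) - seq_len)`, the final possible window is never built, so the last observation (9 here) is not used as a target.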

Building LSTM model

We will create a three-layer bidirectional LSTM network, using Dropout with a rate of 20% to control over-fitting during training.

# Imports assumed (the original omits them): Keras as bundled with TensorFlow
from tensorflow import keras
from tensorflow.keras.layers import Activation, Bidirectional, Dense, Dropout, LSTM

DROPOUT = 0.2  # 20% dropout to control over-fitting during training
WINDOW_SIZE = SEQ_LEN - 1

model = keras.Sequential()

# Input layer. Bidirectional RNNs train on the sequence in both the forward and backward direction.
model.add(Bidirectional(LSTM(WINDOW_SIZE, return_sequences=True),
                        input_shape=(WINDOW_SIZE, X_train.shape[-1])))
model.add(Dropout(rate=DROPOUT))

# 1st hidden layer
model.add(Bidirectional(LSTM(WINDOW_SIZE * 2, return_sequences=True)))
model.add(Dropout(rate=DROPOUT))

# 2nd hidden layer
model.add(Bidirectional(LSTM(WINDOW_SIZE, return_sequences=False)))

# Output layer: a single neuron (the predicted Bitcoin price) with a linear activation,
# i.e. the output is proportional to the input
model.add(Dense(units=1))
model.add(Activation('linear'))

BATCH_SIZE = 64

model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit(X_train, y_train, epochs=50, batch_size=BATCH_SIZE,
                    shuffle=False,  # shuffling is not advisable when training on time series
                    validation_split=0.1)

We can also use callbacks during training to help prevent over-fitting. I have not used a callback here, but one can be passed in during the model fitting phase.
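
As a minimal sketch of the callback idea, assuming Keras as bundled with TensorFlow, an `EarlyStopping` callback can stop training once the validation loss stops improving. The tiny model, the random data, and the `patience` value below are placeholders chosen for illustration, not the article's setup.

```python
import numpy as np
from tensorflow import keras

# Placeholder data: 64 random sequences of 5 steps with 1 feature (assumption).
rng = np.random.default_rng(0)
X_demo = rng.random((64, 5, 1)).astype("float32")
y_demo = rng.random((64, 1)).astype("float32")

# Tiny stand-in model, much smaller than the network in the article.
demo_model = keras.Sequential([
    keras.Input(shape=(5, 1)),
    keras.layers.LSTM(4),
    keras.layers.Dense(1),
])
demo_model.compile(loss="mean_squared_error", optimizer="adam")

# Stop when val_loss has not improved for 3 epochs and keep the best weights.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=3,
    restore_best_weights=True,
)

history_demo = demo_model.fit(
    X_demo, y_demo,
    epochs=20,
    batch_size=16,
    shuffle=False,
    validation_split=0.25,
    callbacks=[early_stop],
    verbose=0,
)
print("epochs actually run:", len(history_demo.history["val_loss"]))
```

For the model in this article, the same `callbacks=[early_stop]` argument could be added to the existing `model.fit(...)` call.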

Model Evaluation

A simple way to understand the training process is to view the training and validation loss.

# Training and validation loss per epoch
plt.figure(figsize=(10, 5))
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

Here, we can see some improvement in both the training error and the validation error.

Testing

We have some additional data left for testing. Let's get predictions from the model on that data to check how well the model fits.

from pandas import DataFrame

# Prediction on test data
y_pred = model.predict(X_test)

# Invert the test targets to original values and assign the datetime index
y_test_inverse = DataFrame(scaler.inverse_transform(y_test))
y_test_inverse.index = btc.index[-len(y_test):]
print('Test data:')
print(y_test_inverse.tail(3))
print()

# Invert the predictions to understandable values and assign the same index
y_pred_inverse = DataFrame(scaler.inverse_transform(y_pred))
y_pred_inverse.index = y_test_inverse.index
print('Prediction data:')
print(y_pred_inverse.tail(3))

Accuracy metrics

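from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Note: y_test and y_pred are still on the scaled (0-1) range here
print(f'MAE {mean_absolute_error(y_test, y_pred)}')
print(f'MSE {mean_squared_error(y_test, y_pred)}')
print(f'RMSE {np.sqrt(mean_squared_error(y_test, y_pred))}')
print(f'R2 {r2_score(y_test, y_pred)}')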

RMSE penalizes large errors more heavily than MAE does. The error scores are low (note that they are computed on the scaled 0-1 values), but the R2 score shows there is definitely room for improvement. The model can be tuned to get better output.
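
The difference between MAE and RMSE is easiest to see on a toy case. In the sketch below (made-up numbers, with hypothetical helper names `mae` and `rmse`), two sets of predictions have the same MAE, but the one with a single large miss has a much higher RMSE.

```python
import numpy as np

def mae(y_true, y_hat):
    return np.mean(np.abs(y_true - y_hat))

def rmse(y_true, y_hat):
    return np.sqrt(np.mean((y_true - y_hat) ** 2))

y_true = np.array([100.0, 100.0, 100.0, 100.0])

# Four moderate errors of 10 each.
even_errors = np.array([110.0, 90.0, 110.0, 90.0])

# Three perfect predictions and one large error of 40.
one_big_error = np.array([100.0, 100.0, 100.0, 140.0])

print("even errors:   MAE", mae(y_true, even_errors), "RMSE", rmse(y_true, even_errors))
print("one big error: MAE", mae(y_true, one_big_error), "RMSE", rmse(y_true, one_big_error))
```

Both cases have an MAE of 10, while the RMSE is 10 in the first case and 20 in the second, because squaring weights the single large miss more heavily.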

Performance visualization

plt.figure(figsize=(15, 5))
plt.plot(y_test_inverse)
plt.plot(y_pred_inverse)
plt.title('Actual vs Prediction plot (Price prediction model)')
plt.ylabel('price')
plt.xlabel('date')
plt.legend(['actual', 'prediction'], loc='upper left')
plt.show()

It looks like our basic, brief model has been able to capture the general pattern of the data. However, it failed to capture the stochastic movements, which may be a good sign: it suggests the model generalizes rather than fitting noise.

Conclusion & future scope

Predicting stock market returns is a challenging task because stock values change constantly and depend on multiple parameters that form complex patterns.

Future directions could be:

- analyzing the correlation between different cryptocurrencies and how it would affect the performance of our model.
- adding features from technical analysis to check the model performance.
- adding features from fundamental analysis to check how those affect the model.
- adding sentiment analysis from social networks (e.g. Twitter) and news reports to check model performance.
- trying a GRU network, possibly with a different activation (e.g. 'softsign'), to check the performance.
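
The GRU idea could be sketched as follows, assuming Keras as bundled with TensorFlow. The layer sizes (64 and 32 units) are arbitrary illustration values, and `gru_model` is a hypothetical name; this is not a tuned or tested replacement for the LSTM model above.

```python
from tensorflow import keras
from tensorflow.keras.layers import GRU, Dense, Dropout

WINDOW_SIZE = 99  # SEQ_LEN - 1, as in the LSTM model

gru_model = keras.Sequential([
    keras.Input(shape=(WINDOW_SIZE, 1)),                    # univariate input: close price only
    GRU(64, activation="softsign", return_sequences=True),  # softsign instead of the default tanh
    Dropout(0.2),
    GRU(32, activation="softsign"),
    Dense(1),                                               # single output: the next scaled price
])
gru_model.compile(loss="mean_squared_error", optimizer="adam")
gru_model.summary()
```

One practical consideration: with an activation other than the default tanh, Keras cannot use its cuDNN-accelerated GRU kernel, so training on a GPU may be slower.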

Moreover, multivariate analysis would require a great deal of effort and time for feature engineering, data analysis, model training, etc. For now, we can save our model for future reference.
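
A minimal sketch of saving and reloading a model, assuming a recent TensorFlow release whose Keras supports the `.keras` file format. The tiny model, the file name `btc_lstm.keras`, and the temporary directory are placeholders for illustration.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras

# Tiny stand-in model (assumption); in practice this would be the trained model.
tiny_model = keras.Sequential([
    keras.Input(shape=(5, 1)),
    keras.layers.LSTM(4),
    keras.layers.Dense(1),
])
tiny_model.compile(loss="mean_squared_error", optimizer="adam")

# Save architecture, weights, and optimizer state to a single file.
save_path = os.path.join(tempfile.mkdtemp(), "btc_lstm.keras")
tiny_model.save(save_path)

# Reload and confirm the restored model gives the same predictions.
restored_model = keras.models.load_model(save_path)
sample = np.random.rand(2, 5, 1).astype("float32")
same_output = np.allclose(
    tiny_model.predict(sample, verbose=0),
    restored_model.predict(sample, verbose=0),
    atol=1e-6,
)
print("restored model matches:", same_output)
```

The fitted scaler would need to be stored alongside the model as well, since predictions have to be inverse-transformed with the same minimum and maximum.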

Vijh, M., Chandola, D., Tikkiwal, V. A., & Kumar, A. (2020). Stock Closing Price Prediction using Machine Learning Techniques. Procedia Computer Science, 167, 599–606.

Vanelin Valkov (2015). Hacker's Guide to Machine Learning.

Translated from: https://medium.com/@sarit.maitra/regression-analysis-lstm-network-to-predict-future-prices-b95dc0db6fcc
