kinesis
Kinesis Data Generator is a tool that generates mock data and sends it to Kinesis Firehose or Kinesis Streams. It is an easy way to test your pipeline with streaming data when you do not have enough real data to work with.
In this article, I use Kinesis Data Generator to send mock Stack Overflow data that mimics the original JSON structure I streamed using StackAPI.
Step 1: Go to https://awslabs.github.io/amazon-kinesis-data-generator/web/help.html, which explains how to create an Amazon Cognito user, and download the CloudFormation template.
Step 2: Configure the stack: choose the "Template is ready" option.
Step 3: Upload the JSON file that you downloaded in Step 1.
Step 4: Specify a user name and password.
Step 6: Leave the default options and create the stack.
Step 7: Open Kinesis Data Generator at the URL below and sign in with your credentials.
https://awslabs.github.io/amazon-kinesis-data-generator/web
Step 8: Generate streams using the template provided by Kinesis Data Generator.
You will need to select the region in which the Firehose delivery stream was created.
Choose the number of records per second to send.
Create a template similar to the actual Stack Overflow data.
Below is the sample template I used, modeled on the JSON Stack Overflow data:
{
  "questionid": {{random.number(100000)}},
  "view_count": {{random.number({"min":0, "max":1000})}},
  "is_answered": "{{random.arrayElement(["True","False"])}}",
  "answer_count": {{random.number({"min":0, "max":20})}},
  "score": {{random.number({"min":0, "max":50})}},
  "creation_date": {{random.arrayElement([1546300800])}}
}
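To sanity-check the record shape before sending anything, the template can be mimicked locally. The following is a minimal Python sketch, not part of Kinesis Data Generator: the helper names random_number and mock_question are hypothetical, and treating the min/max bounds as inclusive is an assumption of this sketch.

```python
import json
import random


def random_number(spec):
    """Rough stand-in for the template's random.number helper.

    Accepts an int upper bound or a {"min": ..., "max": ...} dict.
    Inclusive bounds are an assumption of this sketch.
    """
    if isinstance(spec, dict):
        return random.randint(spec.get("min", 0), spec["max"])
    return random.randint(0, spec)


def mock_question():
    """Build one record with the same fields as the template."""
    return {
        "questionid": random_number(100000),
        "view_count": random_number({"min": 0, "max": 1000}),
        "is_answered": random.choice(["True", "False"]),
        "answer_count": random_number({"min": 0, "max": 20}),
        "score": random_number({"min": 0, "max": 50}),
        # Single-element list, so the value is always the same
        "creation_date": random.choice([1546300800]),
    }


if __name__ == "__main__":
    # Print a few sample records as JSON lines
    for _ in range(3):
        print(json.dumps(mock_question()))
```

Comparing a few of these records against real StackAPI output is a quick way to spot a missing field or a wrong type before the data reaches the delivery stream.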
Here, I want creation_date to stay constant, so random.arrayElement picks from a single-element array and always returns the same Unix timestamp (1546300800, which is 2019-01-01 00:00:00 UTC).
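The creation_date value is a Unix timestamp in seconds. As a small sketch (the function names to_utc_date and midnight_utc_epoch are hypothetical helpers, not part of any tool used here), the following Python shows how to check which date a timestamp represents and how to compute the midnight-UTC timestamp for a given day if you want to pin the template to a different date:

```python
from datetime import datetime, timezone


def to_utc_date(epoch_seconds):
    """Return the UTC calendar date (YYYY-MM-DD) for a Unix timestamp."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).date().isoformat()


def midnight_utc_epoch(day=None):
    """Return epoch seconds for 00:00:00 UTC of `day` (default: today's UTC date)."""
    day = day or datetime.now(timezone.utc).date()
    midnight = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    return int(midnight.timestamp())


if __name__ == "__main__":
    print(to_utc_date(1546300800))  # 2019-01-01
    print(midnight_utc_epoch())     # value to paste into the template for today
```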
Step 10: Send data to Kinesis Firehose.
The Firehose output files will be created in your S3 bucket in the format you specified when you created the delivery stream.
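When checking the delivered files, keep in mind that records may arrive back to back in a single object with no separator if the template does not end with a newline. The sketch below assumes uncompressed JSON output with no format conversion, and the function name split_concatenated_json is hypothetical; it splits such text into individual records using Python's standard json module:

```python
import json


def split_concatenated_json(text):
    """Split text like '{...}{...}' (optionally newline-separated) into objects."""
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        records.append(obj)
    return records


if __name__ == "__main__":
    sample = '{"questionid": 1, "score": 2}{"questionid": 3, "score": 4}\n'
    for record in split_concatenated_json(sample):
        print(record["questionid"], record["score"])
```

In practice you would read the object body from S3 first and pass the decoded text to this function.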
Translated from: https://medium.com/@snehamehrin22/how-to-generate-mock-streaming-data-using-kinesis-data-generator-a3dce7d43236