kinesis

    Technology  2025-03-15

    Kinesis Data Generator is a tool that generates mock data and sends it to Kinesis Firehose or Kinesis Streams. It is an easy way to test your pipeline with streaming data when you do not have enough real data to work with.

    In this article, I use Kinesis Data Generator to send mock Stack Overflow data that mimics the JSON structure of the real data I streamed using StackAPI.

    Step 1: Go to https://awslabs.github.io/amazon-kinesis-data-generator/web/help.html, create an Amazon Cognito user, and download the CloudFormation template.

    Step 2: Configure the stack: choose the "Template is ready" option.

    Step 3: Upload the JSON template file that you downloaded in Step 1.

    Step 4: Specify a user name and password.

    Step 6: Leave the default options and create the stack.
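
    Steps 2 to 6 can also be scripted instead of clicked through in the console. The following is a minimal sketch, not the method used in this article. It assumes boto3 is installed and configured with credentials, and that the template's parameters are named Username and Password (check the Parameters section of the file you downloaded). The stack name and the function names are hypothetical.

```python
from pathlib import Path


def build_stack_request(template_body: str, username: str, password: str) -> dict:
    """Build the keyword arguments for CloudFormation's create_stack call."""
    return {
        "StackName": "kinesis-data-generator-cognito-user",  # hypothetical name
        "TemplateBody": template_body,
        # Assumed parameter names; verify them against the downloaded template.
        "Parameters": [
            {"ParameterKey": "Username", "ParameterValue": username},
            {"ParameterKey": "Password", "ParameterValue": password},
        ],
        # Assumption: the template creates IAM resources, which CloudFormation
        # only allows when this capability is acknowledged.
        "Capabilities": ["CAPABILITY_IAM"],
    }


def create_stack(template_path: str, username: str, password: str, region: str) -> str:
    """Create the stack and return its ID. Calls AWS, so it needs credentials."""
    import boto3  # imported lazily so build_stack_request works without boto3

    request = build_stack_request(Path(template_path).read_text(), username, password)
    client = boto3.client("cloudformation", region_name=region)
    return client.create_stack(**request)["StackId"]
```

    Keeping the request builder separate from the AWS call makes it possible to inspect the parameters before anything is created.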

    Step 7: Open Kinesis Data Generator at the URL below and sign in with your credentials.

    https://awslabs.github.io/amazon-kinesis-data-generator/web

    Step 8: Generate streams using the template provided by Kinesis Data Generator.

    You will need to select the region in which the Firehose delivery stream was created.

    Choose the number of records per second to send.

    Create a template similar to the actual Stack Overflow data.

    Sample of the Stack Overflow JSON data

    Below is the sample template I used:

    {
        "questionid": {{random.number(100000)}},
        "view_count": {{random.number({"min": 0, "max": 1000})}},
        "is_answered": "{{random.arrayElement(["True", "False"])}}",
        "answer_count": {{random.number({"min": 0, "max": 20})}},
        "score": {{random.number({"min": 0, "max": 50})}},
        "creation_date": {{random.arrayElement([1546300800])}}
    }

    Here, I want creation_date to be constant, so random.arrayElement picks from a single-element array and every record gets the same Unix timestamp (1546300800, which is 2019-01-01 00:00:00 UTC).
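
    To check the shape of the records before sending anything, the template can be imitated locally. This is a rough Python equivalent under my own assumptions, not something Kinesis Data Generator runs: the function name is made up, and I assume the random.number bounds are inclusive.

```python
import json
import random

# Same constant as in the template: 2019-01-01 00:00:00 UTC as a Unix timestamp.
FIXED_CREATION_DATE = 1546300800


def mock_question(rng: random.Random) -> dict:
    """Return one mock Stack Overflow record shaped like the template output."""
    return {
        "questionid": rng.randint(0, 100000),
        "view_count": rng.randint(0, 1000),
        # The template emits the strings "True"/"False", not JSON booleans.
        "is_answered": rng.choice(["True", "False"]),
        "answer_count": rng.randint(0, 20),
        "score": rng.randint(0, 50),
        "creation_date": FIXED_CREATION_DATE,
    }


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so repeated runs give the same records
    for _ in range(3):
        print(json.dumps(mock_question(rng)))
```

    Note that is_answered stays a string here because the template wraps it in quotes. If the downstream schema expects a boolean, the template would need to change as well.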

    Step 10: Send data to Kinesis Firehose.

    The Firehose output files will be created in your S3 bucket in the format you specified when you created the delivery stream.
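
    One thing to keep in mind when reading those files back: as far as I know, Firehose writes records into each S3 object back to back and does not add a delimiter by default. Unless the template ends with a newline, one object can hold several JSON documents glued together. A minimal sketch of splitting such content with the standard library only (the function name is my own):

```python
import json


def split_concatenated_json(blob: str) -> list:
    """Parse JSON documents that follow each other with or without whitespace."""
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    end = len(blob)
    while True:
        # raw_decode does not accept leading whitespace, so skip it first.
        while pos < end and blob[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # raw_decode returns the parsed object and the index where it stopped.
        obj, pos = decoder.raw_decode(blob, pos)
        records.append(obj)
    return records
```

    For example, passing the decoded text of one downloaded S3 object returns a list with one dict per record that Kinesis Data Generator sent.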

    Translated from: https://medium.com/@snehamehrin22/how-to-generate-mock-streaming-data-using-kinesis-data-generator-a3dce7d43236
