Kafka configuration file explained (producer.properties)

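The producer is one of Kafka's core components, so it is well worth studying. Its configuration file, producer.properties, is in the config directory of the Kafka installation.
Configuring it properly will save considerable effort in later study and work.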

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# see kafka.producer.ProducerConfig for more details


############################# Producer Basics #############################


# list of brokers used for bootstrapping knowledge about the rest of the cluster
# format: host1:port1,host2:port2 ...
# Broker list (required; for a cluster, list the individual nodes; this example is a single machine)

metadata.broker.list=localhost:9092


# name of the partitioner class for partitioning events; default partition spreads data randomly
# Partitioner class. Default: kafka.producer.DefaultPartitioner
partitioner.class=kafka.producer.DefaultPartitioner


# specifies whether the messages are sent asynchronously (async) or synchronously (sync)
# sync (the default) sends synchronously; async sends asynchronously and can improve send throughput

producer.type=sync


# specify the compression codec for all data generated: none, gzip, snappy, lz4.
# the old config values work as well: 0, 1, 2, 3 for none, gzip, snappy, lz4, respectively
# Compression: 0 = no compression, 1 = gzip, 2 = snappy (3 = lz4, per the comment above)
compression.codec=0



# message encoder
# Serializer class for messages
serializer.class=kafka.serializer.DefaultEncoder




# allow topic level compression
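# When compression is enabled, lists the topics whose messages are compressed.
# Default is empty, in which case the codec set in compression.codec applies to all topics
# (with compression.codec=0 nothing is compressed).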
#compressed.topics=




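# Whether the producer requires an acknowledgement from the broker for sent data; three values: 0, 1, -1
# 0: the producer does not wait for an ack from the broker
# 1: the ack is sent once the leader has received the message
# -1: the ack is sent once all in-sync replicas (followers) have replicated the message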
request.required.acks=0




############################# Async Producer #############################
# maximum time, in milliseconds, for buffering data on the producer queue
# In async mode, buffered messages are sent to the broker as a batch once this time elapses; default 5000 ms
queue.buffering.max.ms=5000


# the maximum size of the blocking queue for buffering on the producer
# In async mode, the maximum number of messages the producer may buffer
queue.buffering.max.messages=20000




# Maximum time the broker waits to satisfy request.required.acks before replying to the producer

request.timeout.ms=10000


# Timeout for event enqueue:
# 0: events will be enqueued immediately or dropped if the queue is full
# -ve: enqueue will block indefinitely if the queue is full
# +ve: enqueue will block up to this many milliseconds if the queue is full
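# Once the number of messages queued at the producer reaches queue.buffering.max.messages
# and, after blocking for some time, the queue still has no free slot (the producer has still not sent anything),
# the producer can either keep blocking or drop the message
# -1: block with no timeout; messages are never dropped
# 0: do not block; the message is dropped immediately if the queue is full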
queue.enqueue.timeout.ms=-1


# the number of messages batched at the producer
# In async mode, the number of messages sent per batch; default 200
batch.num.messages=200
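
The three queue.enqueue.timeout.ms modes described in the comments above can be illustrated with a small sketch. This is not Kafka code: it uses Python's standard queue.Queue as a stand-in for the producer's buffer, and the helper name enqueue is hypothetical.

```python
import queue


def enqueue(buffer, message, timeout_ms):
    """Mimic the three queue.enqueue.timeout.ms modes (illustration only).

    Returns True if the message was buffered, False if it was dropped.
    """
    try:
        if timeout_ms == 0:
            # 0: enqueue immediately, or drop if the queue is full
            buffer.put_nowait(message)
        elif timeout_ms < 0:
            # negative: block indefinitely until a slot is free
            buffer.put(message)
        else:
            # positive: block for up to timeout_ms milliseconds
            buffer.put(message, timeout=timeout_ms / 1000.0)
        return True
    except queue.Full:
        return False


# A tiny buffer standing in for queue.buffering.max.messages=2
buffer = queue.Queue(maxsize=2)
results = [enqueue(buffer, m, timeout_ms=0) for m in ("a", "b", "c")]
print(results)  # [True, True, False]
```

With a full buffer and nothing draining it, a positive timeout also ends in a drop once the wait expires, while a negative value would block the caller forever.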


Supplement: detailed producer.properties settings

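| Property | Default | Description |
| --- | --- | --- |
| metadata.broker.list | (none) | The list of brokers the producer queries at startup; it may be a subset of all brokers in the cluster. Note that this parameter is used only to fetch topic metadata; the producer picks the appropriate brokers from that metadata and opens socket connections to them. Format: host1:port1,host2:port2. |
| request.required.acks | 0 | See Section 3.2; the values 0, 1 and -1 are also described in the file comments above. |
| request.timeout.ms | 10000 | How long the broker waits for the required acks. If the wait exceeds this value, an error is returned to the client. |
| producer.type | sync | Synchronous or asynchronous mode: async or sync. In async mode the producer can push data in batches, which can raise throughput considerably; async is recommended. |
| serializer.class | kafka.serializer.DefaultEncoder | Serializer class for messages. By default messages are serialized as byte[]. |
| key.serializer.class | (none) | Serializer class for keys. Defaults to the same class as serializer.class. |
| partitioner.class | kafka.producer.DefaultPartitioner | Partitioner class. The default hashes the key. |
| compression.codec | none | Compression format for producer messages. Possible values: "none", "gzip", "snappy". For compression, see Section 4.1. |
| compressed.topics | null | Names of the topics for which compression is enabled. If a compression format is selected above, compression applies only to the topics listed here; if this parameter is empty, it applies to all topics. |
| message.send.max.retries | 3 | Number of retries when a send fails. Network problems may lead to repeated retries. |
| retry.backoff.ms | 100 | Before each retry, the producer refreshes the metadata of relevant topics to see if a new leader has been elected. Since leader election takes a bit of time, this property specifies the amount of time that the producer waits before refreshing the metadata. |
| topic.metadata.refresh.interval.ms | 600 * 1000 | The producer generally refreshes the topic metadata from brokers when there is a failure (partition missing, leader not available…). It will also poll regularly (default: every 10 min, so 600000 ms). If you set this to a negative value, metadata will only get refreshed on failure. If you set this to zero, the metadata will get refreshed after each message sent (not recommended). Important note: the refresh happens only AFTER the message is sent, so if the producer never sends a message the metadata is never refreshed. |
| queue.buffering.max.ms | 5000 | In async mode, how long the producer buffers messages. For example, with a value of 1000 it buffers one second of data and then sends it in one go. This can raise throughput considerably, at the cost of higher latency. |
| queue.buffering.max.messages | 10000 | In async mode, the maximum number of messages held in the producer's buffer queue. Once this number is exceeded, the producer either blocks or drops messages. |
| queue.enqueue.timeout.ms | -1 | How long the producer blocks once the limit above is reached. With 0, the producer does not block when the buffer queue is full and the message is dropped immediately. With -1, the producer blocks and no message is dropped. |
| batch.num.messages | 200 | In async mode, the number of messages collected into one batch. The producer sends once this many messages are ready or queue.buffering.max.ms has elapsed, whichever comes first. |
| send.buffer.bytes | 100 * 1024 | Socket write buffer size. |
| client.id | "" | The client id is a user-specified string sent in each request to help trace calls. It should logically identify the application making the request. |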
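
As a rough illustration of how the file values relate to the defaults listed above, the sketch below overlays a few lines from the file onto those defaults and splits metadata.broker.list into host/port pairs. It is plain Python, not part of Kafka. The helper names parse_properties and parse_broker_list are hypothetical, and the parser handles only simple key=value lines and # comments, not the full Java properties syntax.

```python
# Defaults copied from the table above (values kept as strings, as in the file)
DEFAULTS = {
    "request.required.acks": "0",
    "request.timeout.ms": "10000",
    "producer.type": "sync",
    "compression.codec": "none",
    "message.send.max.retries": "3",
    "retry.backoff.ms": "100",
    "queue.buffering.max.ms": "5000",
    "queue.buffering.max.messages": "10000",
    "queue.enqueue.timeout.ms": "-1",
    "batch.num.messages": "200",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def parse_broker_list(value):
    """Split 'host1:port1,host2:port2' into (host, port) tuples."""
    brokers = []
    for entry in value.split(","):
        host, _, port = entry.strip().rpartition(":")
        brokers.append((host, int(port)))
    return brokers


sample = """
# a few lines taken from the file shown earlier
metadata.broker.list=localhost:9092
producer.type=sync
queue.buffering.max.messages=20000
#compressed.topics=
"""

# Values set in the file override the defaults; everything else falls through
effective = {**DEFAULTS, **parse_properties(sample)}
brokers = parse_broker_list(effective["metadata.broker.list"])
print(brokers)
print(effective["queue.buffering.max.messages"])
```

Here queue.buffering.max.messages takes the file's value of 20000 instead of the default 10000, while batch.num.messages, which the sample does not set, keeps its default of 200.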


