Error connection reset by peer redis


Closed · venthur opened this issue on Jul 9, 2019 · 35 comments

@venthur

Version:
redis-py: 3.2.1
redis server: 5.0.5

Platform: Debian

Description:
I have a connection to the server that is idle for a few hours, and when I try to perform an operation, I get a redis.exceptions.ConnectionError: Error while reading from socket: (104, 'Connection reset by peer'). This doesn't happen every time, but often enough to be annoying. I can wrap the call in a try-except, reconnect, and it works, but I don't want to wrap every Redis command like this.

Some Background: The program is getting tasks from a Queue implemented in Redis, computing for a few hours, getting the next task, and so on.

The question is: is this a bug or expected behavior? Is there an option to prevent it? I don't have any timeouts set on the client or server (everything is at its default). The current workaround would be to emulate a pre-ping by sending a PING wrapped in a try-except that reconnects if needed, but it's not very pretty.
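The pre-ping workaround described above can be sketched roughly like this (a minimal sketch; the helper name is made up, and with a real client the exception to swallow is redis.exceptions.ConnectionError):

```python
def ensure_alive(client):
    """Emulate a pre-ping: send PING before real work and swallow a
    connection failure. redis-py opens a fresh socket on the next
    command after a connection error, so the command that follows
    this call runs on a live connection."""
    try:
        client.ping()
    except ConnectionError:  # redis.exceptions.ConnectionError with redis-py
        pass

# usage (hypothetical):
#   ensure_alive(r)
#   task = r.lpop("task-queue")
```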


@squirrel532

Similar problem here, with python 2.7.14.

@macropin

@venthur @azdkj532 Are you running under Kubernetes?

@andymccurdy

Hi, sorry I haven’t responded to this issue earlier. Upgrading to version 3.3.x should improve your experience. 3.3.x added a new option health_check_interval. You can use health checks like:

client = redis.Redis(..., health_check_interval=30)

The value of health_check_interval specifies the number of seconds a connection can be idle before its health must be checked. In the example above, if the connection has been idle for more than 30 seconds, a round-trip PING/PONG will be attempted just before your next command. If the PING/PONG fails, the connection will be reestablished behind the scenes before your intended command is executed.



@dejlek

I have had this problem since 3.0 was released…

@andymccurdy

@dejlek upgrade to the latest 3.3.x and turn on health checks.


@andymccurdy

@dejlek for now the health checks are opt in.

@ulodciv

We are currently testing GCP MemoryStore as a replacement for memcache.

Using GAE Python 2.7, we get random "ConnectionError: Error while reading from socket: (104, 'Connection reset by peer')" errors. Typically the error occurs after about 120 seconds of waiting for some Redis command to complete (DELETE, SET, GET), so it can be hard to handle with a backoff mechanism.

We use version 3.3.8. I’ve been testing using health_check_interval=30 and lower values than 30 (down to 2, currently). This seems to have made the errors less frequent, but they still occur often enough to be of concern.

Perhaps this is a purely MemoryStore/redis server issue, however.


@bowd

I have the same issue as @ulodciv. I’m considering whether it has anything to do with the beta VPC needed to talk to MemoryStore. I tried the same setup with Redis running on a normal Container and the same thing happens.

@venthur

Hi @andymccurdy, sorry for the late reply. We're testing it now; it might take some time to say with confidence whether it worked, but looking at the code I'm quite positive.

@gimmeao

I am also running on GCP MemoryStore (standard environment with a VPC access connector), and I get seemingly random 104 Connection reset errors (a string of 4 or 5 within a few seconds, about once a day). I've tried the health_check_interval fix and TCP keepalive; nothing seems to work. What does seem to work is switching to the flex environment (so dropping the VPC access connector) or downgrading to redis-py 3.0.1 (where the socket reconnect still works). I can't readily put reconnection logic into my code because the cookie management library I'm using (flask-session) uses Redis internally. I'd be really happy if the socket-reconnect logic were put back into redis-py, if only as an option.

@andymccurdy

@gimmeao It sounds like something in the VPC access connector environment is severing your connections. We’d need more info to figure out what’s going on there.

What value are you supplying to health_check_interval?

It looks like the VPC access connector times out idle connections, but this is configurable. You should make sure that health_check_interval is less than the idle timeout value in your environment.

The socket reconnect logic was a bug and can lead to data loss/corruption. There’s no plan to reintroduce that. However, if that’s really what you want to do, you could subclass the Connection class and do the retry yourself, assuming that you can supply connection options to the library that’s using redis-py.

@gimmeao

I tried 30 and 15 for health_check_interval. I'm not sure that the timeout is configurable on Google's side; at least if it is, I haven't found where, and I've looked quite hard. In any case I'm pretty much just using Redis as a database cache that I rarely write to, so I think I can live with the potential issues. Could be it's a bug on G's side; I'll probably try again in a few months and see if anything has changed. Thanks for the response.


@andymccurdy

Closing this as 3.3.x+ has provided options to resolve this issue. It seems like the only outstanding issue is specific to Google Cloud.

If I’ve missed something please open a new issue.

@bmerry

The socket reconnect logic was a bug and can lead to data loss/corruption.

Is that just because commands might get executed more than once if the connection is severed after the server received the command? Or is there something more insidious that I should be aware of if I plan to implement my own retry logic for idempotent commands?

@andymccurdy

@bmerry Just the duplicate command execution. The severity depends on the command. Retrying a SET is probably not a problem as the key just gets overwritten with the same value. Other commands, such as all the list mutation commands, are much more problematic.
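The difference is easy to see with a toy model of a command being executed twice (once for the original send, once for the retry after the reply was lost): a plain SET converges to the same state, while a list push grows per call:

```python
# toy model: the command reaches the server, the reply is lost,
# and the client retries, so the server executes it twice
store, queue = {}, []

def set_cmd(key, value):
    store[key] = value          # idempotent: same end state either way

def lpush_cmd(value):
    queue.insert(0, value)      # not idempotent: one element per call

for _ in range(2):              # original attempt + retry
    set_cmd("greeting", "hi")
    lpush_cmd("job-1")

# store holds a single key, but the queue now contains the job twice
```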

@bmerry

Thanks. Our application uses only idempotent commands (SETNX, HSETNX, ZADD) so I’ll look at implementing retries at the application level.

Do you have any recommendations for application-level retry logic that ensures that a fresh connection will be attempted? I’m worrying about the case where I have a pool with lots of connections and something in the network severs all of them. If I just retry N times (fixed N, which is less than the size of the pool) on the redis.Redis object, it could presumably just cycle through the dead pool connections without ever trying to establish a new one?

@andymccurdy

It’s easier than that. If you get a TimeoutError or ConnectionError, you can simply call .disconnect() followed by .connect() on the connection that raised the error. You could do this by subclassing Redis.execute_command and putting your retry logic there.
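A minimal sketch of that suggestion, written here as a wrapper rather than a subclass so the retry logic stands alone (the class name is made up; with redis-py the exceptions to catch are redis.exceptions.ConnectionError and redis.exceptions.TimeoutError):

```python
class RetryOnDisconnect:
    """Wrap a client and retry a failed command once. After a
    connection error, redis-py discards the broken socket, so the
    retry runs on a freshly established connection."""

    def __init__(self, client, exceptions=(ConnectionError, TimeoutError)):
        self.client = client
        self.exceptions = exceptions

    def execute_command(self, *args, **options):
        try:
            return self.client.execute_command(*args, **options)
        except self.exceptions:
            # first failure: assume an idle reset and try exactly once more
            return self.client.execute_command(*args, **options)
```

As discussed above, this is only safe for idempotent commands, since the original attempt may have already been executed by the server.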

@bmerry

Thanks, I think I understand. I’ll try that if health_check_interval doesn’t fix my issues — still not sure what causes connections to drop in the first place since we’re running a bare-metal network and nothing should be monkeying around at the TCP layer.

@tastypackets

Related to GCP issues, it appears to have gone away after I changed to Redis version 5 inside MemoryStore. I have only been testing it for a day on version 5, but so far I haven’t seen this error again and I used to see it all the time.

@rushilsrivastava

For all of you who had issues with GCP, are you guys still facing this problem? We have health check intervals setup and are on Redis 5 — not sure what else to try. We keep getting hundreds of these errors each day.

@tastypackets

For all of you who had issues with GCP, are you guys still facing this problem? We have health check intervals setup and are on Redis 5 — not sure what else to try. We keep getting hundreds of these errors each day.

I haven’t had any more issues, I’m running redis v5 and have it running in the same region as my app.

@gimmeao

I no longer have the issue; I made a few changes. I'm not totally sure my changes are what fixed it, but it's possible. In my main library's init I added a connect_to_cache method:

def connect_to_cache():
    global r
    r = redis.Redis(
            host=redis_host, port=redis_port, health_check_interval=10,
            socket_timeout=10, socket_keepalive=True,
            socket_connect_timeout=10, retry_on_timeout=True
            )

and I added a route for GCP’s warmup request:

@app.route('/_ah/warmup/', methods=['GET'])
def warmup():
    shfl.connect_to_cache()
    return ('OK', 200)

@pacssee

@tastypackets Yes, this error appears every few minutes; the config changes suggested here don't do the trick for me.

@tpdownes

I've thrown the kitchen sink at this problem in an application I'm working on. It has not yet seen enough action to conclude one way or the other, but I am posting it here on the off chance it may help someone. The big point is configuring keepalive options beyond the boolean socket_keepalive. In particular, the default value for TCP_KEEPIDLE is 7200 seconds (my understanding is that there is an RFC specifying 2 hours as the minimum acceptable default for TCP).

import socket
from sys import platform

import redis

if platform == "linux":
    ka_options = {
        socket.TCP_KEEPIDLE: 10,   # start probing after 10s idle
        socket.TCP_KEEPINTVL: 5,   # probe every 5s
        socket.TCP_KEEPCNT: 5,     # give up after 5 failed probes
    }
elif platform == "darwin":
    # python 3.10 will support socket.TCP_KEEPALIVE on macOS, which
    # is the direct analog of TCP_KEEPIDLE on Linux
    ka_options = {socket.TCP_KEEPINTVL: 5, socket.TCP_KEEPCNT: 5}
else:
    ka_options = {}

# Config is the application's own settings object
connection = redis.StrictRedis(host=Config.REDIS_HOST,
                               port=Config.REDIS_PORT,
                               health_check_interval=15,
                               socket_keepalive=True,
                               socket_keepalive_options=ka_options)

@hardiksondagar

I'm able to reproduce this scenario with the following setup.

import time

import redis
from redis.retry import Retry
from redis.exceptions import TimeoutError, ConnectionError
from redis.backoff import NoBackoff

"""
Timeout Error Investigation
"""
redis_client = redis.StrictRedis(
    retry=Retry(NoBackoff(), 5), retry_on_timeout=True,
    health_check_interval=5,
    retry_on_error=[TimeoutError, ConnectionError]
)

# reproduce error with connection reset
print(redis_client.execute_command('CONFIG SET timeout 10'))
print(redis_client.set("spock", "vulcan"))
print(redis_client.get("spock"))
print(redis_client.execute_command('RESET'))
time.sleep(121)
try:
    print(redis_client.set("spock", "vulcan"))
except ConnectionResetError:
    # it will be raised
    print("Connection reset")
# this time it will be successful
print(redis_client.set("spock", "vulcan"))

configuration

redis server - 6.2.6
python redis - 4.1.0

UPDATE: I've updated my sample code so that it reproduces the connection reset by peer error.

@sshishov

Hey guys, is there really no fix for this issue?

I am also on GCP and also having random issues. I am using django_cache, which uses redis-py under the hood. I am providing socket_keepalive and health_check_interval but still hit the mentioned issue from time to time, at least once a day.

@pacssee

If you are using NDB to interact with the Datastore, make sure to pass NDB a Redis client you constructed yourself with the parameters suggested above, instead of using ndb.RedisCache.from_environment(). This is what solved the problem for us.

@sshishov

Hello guys, after debugging with a TCP dump and ipdb inside Python, I finally discovered why the solutions provided here were not working, or working but not always.

The next portion of the discussion applies to people who use django-redis as their CACHE backend configuration, not redis-py directly.

The problem was in the configuration of the Redis client: the options must be set in CONNECTION_POOL_KWARGS, not in REDIS_CLIENT_KWARGS.

We currently have the following configuration, and the health check with PING/PONG commands is working correctly.

from redis.backoff import FullJitterBackoff
from redis.retry import Retry

CACHES = {
    'default': {
        ...
        'OPTIONS': {
            'CLIENT_CLASS': 'django_redis.client.DefaultClient',
            'SOCKET_CONNECT_TIMEOUT': 10,
            'SOCKET_TIMEOUT': 60,
            'CONNECTION_POOL_KWARGS': {  # <-- note: kwargs for the connection pool, not the client
                'socket_keepalive': True,
                'health_check_interval': 30,  # <-- run a health check if the last command executed more than 30 seconds ago
                'retry_on_timeout': True,  # <-- retry on timeout (just in case)
                'retry': Retry(FullJitterBackoff(cap=5, base=1), 5),  # <-- retry policy; without it you get 0 retries, so a failed PING-PONG health check fails hard immediately with a traceback from the health-check command
            },
        },
    }
}

Now everything is working and is retried 5 times in case of ConnectionError or TimeoutError. Also, if you need to handle special exceptions, you can specify them using a dedicated kwarg.


@hyeongguen-song

Is there any way to reproduce "Connection reset by peer"? CLIENT KILL is not working in my environment (OS: macOS, Redis on Docker).

I am using an ElastiCache single-node-shard Redis 4.0 (or later) instance.

I enabled in-transit encryption and set a Redis AUTH token.

I created a bastion host with stunnel using this link:

https://aws.amazon.com/premiumsupport/knowledge-center/elasticache-connect-redis-node/

I am able to connect to the ElastiCache Redis node the following way:

redis-cli -h hostname -p 6379 -a mypassword

and I can telnet as well. BUT when I run PING (expected response "PONG") in redis-cli after connecting, it gives:

"Error: Connection reset by peer"

I checked the security groups on both sides. Any idea? The bastion host is an Ubuntu 16.04 machine.

asked Sep 29, 2018 at 18:16 by Shree Prakash (edited by LeoMurillo)

As I mentioned in the question, I was running the command like this:

redis-cli -h hostname -p 6379 -a mypassword

The correct way to connect to an ElastiCache cluster through stunnel is to use "localhost" as the host address, like this:

redis-cli -h localhost -p 6379 -a mypassword

Here is the explanation for using the localhost address: when you create a tunnel between your bastion server and the ElastiCache host through stunnel, the program starts a service that listens on a local TCP port (6379), encapsulates the communication using the SSL protocol, and transfers the data between the local server and the remote host. You need to start stunnel, check that the service is listening on the localhost address (127.0.0.1), and connect using "localhost" as the destination address:

  1. Start stunnel. (Make sure you have installed stunnel using this link https://aws.amazon.com/premiumsupport/knowledge-center/elasticache-connect-redis-node/)

    $ sudo stunnel /etc/stunnel/redis-cli.conf

  2. Use the netstat command to confirm that the tunnels have started:

    $ netstat -tulnp | grep -i stunnel

  3. You can now use redis-cli to connect to the encrypted Redis node through the local endpoint of the tunnel:

    $ redis-cli -h localhost -p 6379 -a MySecretPassword

    localhost:6379> set foo "bar"
    OK
    localhost:6379> get foo
    "bar"
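For reference, the /etc/stunnel/redis-cli.conf used in step 1 typically looks something like the following. This is a sketch based on the AWS guide linked above; the endpoint hostname is a placeholder you must replace with your own:

```ini
fips = no
pid = /var/run/stunnel.pid

[redis-cli]
client = yes
accept = 127.0.0.1:6379
connect = my-cluster.xxxxxx.use1.cache.amazonaws.com:6379
```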

answered Oct 8, 2018 at 13:50 by Shree Prakash

"Error: Connection reset by peer" indicates that Redis is killing your connection without sending any response.

One possible cause is that you are trying to connect to the Redis node without using SSL, in which case your connection is rejected by the Redis server without a response [1]. Make sure you are connecting through the correct port of your tunnel proxy. If you are connecting directly from the bastion host, you should be using localhost.

Another option is that you have configured your stunnel without a version of SSL that is supported by Redis. Double-check that the config file is exactly the same as the one provided in the support doc.

If that doesn't solve your problem, you can try building the CLI included in AWS's open-source contribution [2]. You'll need to check out the repository, follow the instructions in the README, and build redis-cli with make BUILD_SSL=yes.

[1] https://github.com/madolson/redis/blob/unstable/src/ssl.c#L464
[2] https://github.com/madolson/redis/blob/unstable/SSL_README.md

answered Oct 2, 2018 at 16:16 by reconditeRose

I tried to forward guest port 6379 to host ports 6379 and 16379, but with no luck.

I can connect to Redis from the guest and SET and GET; from the host I can also connect and run HELP, but I cannot SET or GET.

I have no firewall running on the guest or the host. Any help appreciated.

From host:

host: > redis-cli -h localhost -p 16379
localhost:16379> help
redis-cli 2.8.4
Type: "help @<group>" to get a list of commands in <group>
      "help <command>" for help on <command>
      "help <tab>" to get a list of possible help topics
      "quit" to exit
localhost:16379> help get

  GET key
  summary: Get the value of a key
  since: 1.0.0
  group: string

localhost:16379> get 'x'
Error: Connection reset by peer
localhost:16379> set 'x' 12
Error: Connection reset by peer

From guest:

vagrant:~$ redis-cli -v
redis-cli 2.8.4
vagrant:~$ redis-cli
127.0.0.1:6379> set 'x' 12
OK
127.0.0.1:6379> get x
"12"

asked Jul 6, 2015 at 9:29 by zuba

The solution: check your /etc/redis/redis.conf, and change the default

bind 127.0.0.1

to

bind 0.0.0.0

Then restart the service: service redis-server restart

You can then check that Redis is listening on a non-local interface with, for example, netstat -tlnp | grep 6379.

answered Jul 6, 2015 at 9:37 by zuba

We have a master-slave setup with Redis servers, and after some time the master server began to refuse connections with:

Error: Connection reset by peer

Looking in the Redis server's log at /var/log/redis/redis-server.log (the Ubuntu location):

redis-server.log-13447:M 17 Jan 15:28:58.719 # Error registering fd event for the new client: Numerical result out of range (fd=24099)
redis-server.log-13447:M 17 Jan 15:28:58.729 # Error registering fd event for the new client: Numerical result out of range (fd=24099)
redis-server.log-13447:M 17 Jan 15:28:58.779 # Error registering fd event for the new client: Numerical result out of range (fd=24099)
redis-server.log-13447:M 17 Jan 15:28:59.723 # Error registering fd event for the new client: Numerical result out of range (fd=24099)
redis-server.log-13447:M 17 Jan 15:28:59.731 # Error registering fd event for the new client: Numerical result out of range (fd=24099)
redis-server.log-13447:M 17 Jan 15:28:59.782 # Error registering fd event for the new client: Numerical result out of range (fd=24099)
redis-server.log-13447:M 17 Jan 15:29:00.725 # Error registering fd event for the new client: Numerical result out of range (fd=24099)
redis-server.log-13447:M 17 Jan 15:29:00.732 # Error registering fd event for the new client: Numerical result out of range (fd=24099)
redis-server.log-13447:M 17 Jan 15:29:00.784 # Error registering fd event for the new client: Numerical result out of range (fd=24099)


It looked like there were no more file descriptors available to the redis-server process. But why?

Here is why:

srv-redis1 # lsof -n|grep redis|grep FIFO|wc -l
96264
srv-redis1 # netstat -anp      
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      13447/redis-server
.....
.....
.....
redis-ser 13447            redis   51w     FIFO               0,10        0t0  403873809 pipe
redis-ser 13447            redis   52r     FIFO               0,10        0t0  403866724 pipe
redis-ser 13447            redis   53w     FIFO               0,10        0t0  403866724 pipe
redis-ser 13447            redis   54r     FIFO               0,10        0t0  403868523 pipe
redis-ser 13447            redis   55w     FIFO               0,10        0t0  403868523 pipe
redis-ser 13447            redis   56w     FIFO               0,10        0t0  403870163 pipe
......
......

Almost 100,000 FIFO pipes?

The only solution is to restart the server (and it is probably a good idea to upgrade to the latest version!):

srv-redis1 redis # systemctl restart redis-server.service

Our version was the latest of the 4.x branch (4.0.11-1chl1~xenial1) at the time.

Local and remote connections to the master were impossible. The interesting part is that our software on all servers was already connected and working fine, but a restarted server could not connect to the master any more! Here is what you might receive:

srv ~ # redis-cli -h 10.10.10.10 -p 6379
10.10.10.10:6379> INFO
Error: Connection reset by peer
10.10.10.10:6379>

and even on local host in the master:

srv-redis1 redis # redis-cli
127.0.0.1:6379> INFO
Error: Server closed the connection
127.0.0.1:6379> INFO
Error: Connection reset by peer
127.0.0.1:6379>

Increase the “Number of File Descriptors”

You can increase the file descriptors for the redis server in the systemd service file “/lib/systemd/system/redis-server.service”:

.....
LimitNOFILE=100000
.....

But you will probably hit the problem again; it is most likely a bug very similar to this one: https://github.com/antirez/redis/issues/2857
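Rather than editing the packaged unit file directly (a package upgrade may overwrite it), the limit in the snippet above can also be raised with a systemd drop-in; the path below is the conventional one, sketched here as an assumption:

```ini
# /etc/systemd/system/redis-server.service.d/limits.conf
[Service]
LimitNOFILE=100000
```

followed by systemctl daemon-reload and systemctl restart redis-server.service.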

Overview

This topic describes why Redis connection problems occur and how to solve the problems.

Problem Classification

To troubleshoot abnormal connections to a Redis instance, check the following items:

  • Connection Between the Redis Instance and the ECS
  • Public Access (Redis 3.0 Only)
  • Password
  • Instance Configuration
  • Client Connections
  • Bandwidth
  • Redis Performance

Connection Between the Redis Instance and the ECS

The ECS where the client is located must be in the same VPC as the Redis instance and be able to communicate with the Redis instance.

  • For a Redis 3.0 instance, check the security group rules of the instance and the ECS.

    Correctly configure security group rules for the ECS and the Redis instance to allow the Redis instance to be accessed. For details, see How Do I Configure a Security Group?

  • For a DCS Redis 4.0 or 5.0 instance, check the whitelist of the instance.

    If the instance has a whitelist, ensure that the client IP address is included in the whitelist. Otherwise, the connection will fail. For details, see Managing IP Address Whitelist. If the client IP address changes, add the new IP address to the whitelist.

  • Check the regions of the Redis instance and the ECS.

    If the Redis instance and the ECS are not in the same region, create another Redis instance in the same region as the ECS and migrate data from the old instance to the new instance by referring to Data Migration Guide.

  • Check the VPCs of the Redis instance and the ECS.

    Different VPCs cannot communicate with each other. An ECS cannot access a Redis instance if they are in different VPCs. You can establish VPC peering connections to allow the ECS to access the Redis instance across VPCs.

    For more information on how to create and use VPC peering connections, see VPC Peering Connection.

Public Access (Redis 3.0 Only)

Before accessing a Redis instance through a public network, ensure that the instance supports public access. For details, see the public access explanation.

  • Symptom: "Error: Connection reset by peer" is displayed, or a message is displayed indicating that the remote host forcibly closed an existing connection.
    • Possible cause 1: The security group is incorrectly configured.

      Solution: Correctly configure the Redis instance and access the instance by following the public access instructions.

    • Possible cause 2: Check whether the VPC subnet where Redis resides is associated with a network ACL and whether the network ACL denies outbound traffic. If yes, remove the ACL restriction.
    • Possible cause 3: SSL encryption has been enabled, but Stunnel is not configured during connection. Instead, the IP address displayed on the console was used for connection.

      Solution: When enabling SSL encryption, install and configure the Stunnel client. For details, see Enabling SSL Encryption. In the command for connecting to the Redis instance, the address must be set to the IP address and port number of the Stunnel client. Do not use the public access address and port displayed on the console.

  • Symptom: Public access has been automatically disabled.

    Cause: The EIP bound to the DCS Redis instance is unbound. As a result, public access is automatically disabled.

    Solution: Enable public access for the instance and bind an EIP to the instance on the management console. Then, try again.

Password

If the instance password is incorrect, the port can still be accessed but the authentication will fail. If you forget the password, you can reset the password. For details, see Resetting Instance Passwords.

Instance Configuration

If a connection to Redis is rejected, log in to the DCS console, go to the instance details page, and modify the maxclients parameter. For details, see Modifying Configuration Parameters.

Client Connections

  • The connection fails when you use redis-cli to connect to a Redis Cluster instance.

    Solution: Check whether -c is added to the connection command. Ensure that the correct connection command is used when connecting to the cluster nodes.

    • Run the following command to connect to a Redis Cluster instance:

      ./redis-cli -h {dcs_instance_address} -p 6379 -a {password} -c

    • Run the following command to connect to a single-node, master/standby, or Proxy Cluster instance:

      ./redis-cli -h {dcs_instance_address} -p 6379 -a {password}

    For details, see Access Using redis-cli.

  • Error "Read timed out" or "Could not get a resource from the pool" occurs.

    Solution:

    • Check if the KEYS command has been used. This command consumes a lot of resources and can easily block Redis. Instead, use the SCAN command and do not execute the command frequently.
    • Check if the DCS instance is Redis 3.0. Redis 3.0 uses SATA disks. During AOF persistence, the disk performance may occasionally deteriorate and cause a connection failure. In this case, disable AOF persistence if data persistence is not required. Alternatively, you can use a DCS Redis 4.0 instance or later because they use SSD disks that offer higher performance.
  • Error "unexpected end of stream" occurs and causes service exceptions.

    Solution:

    • Optimize the Jedis connection pool by referring to Recommended Jedis Parameter Settings.
    • Check whether there are many big keys. For details, see How Do I Avoid Big Keys and Hot Keys?
  • The connection is interrupted.

    Solution:

    • Modify the application timeout duration.
    • Optimize the service to avoid slow queries.
    • Replace the KEYS command with the SCAN command.
  • If an error occurs when you use the Jedis connection pool, see Troubleshooting a Jedis Connection Pool Error.

Bandwidth

If the bandwidth reaches the upper limit of the corresponding instance specifications, Redis connections may time out.

You can view the Flow Control Times metric to check whether the bandwidth has reached the upper limit.

Then, check whether the instance has big keys and hot keys. If a single key is too large or overloaded, operations on the key may occupy too many bandwidth resources. For details about big keys and hot keys, see Analyzing Big Keys and Hot Keys.

Redis Performance

Connections to an instance may become slow or time out if the CPU usage spikes due to resource-consuming commands such as KEYS, or too much memory is used because the expiration time is not set for the instance or expired keys remain in the memory. In these cases, do as follows:

  • Use the SCAN command instead of the KEYS command, or disable the KEYS command.
  • Check the monitoring data and configure alarm rules. For details, see Setting Alarm Rules for Critical Metrics.

    For example, you can view the Memory Usage and Used Memory metrics to keep track of the instance memory usage, and view the Connected Clients metric to determine whether the instance connections limit has been reached.

  • Check whether the instance has big keys and hot keys.

    For details about the operations of big key and hot key analysis, see Analyzing Big Keys and Hot Keys.
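The KEYS-to-SCAN replacement recommended above can be sketched as a cursor loop. This is a minimal sketch: with redis-py, the client's scan() call returns a (cursor, batch) pair, and iteration ends when the cursor comes back as 0.

```python
def scan_all(client, pattern="*", count=100):
    """Iterate over all keys matching pattern via SCAN. Each call does
    a small, bounded amount of work on the server, unlike KEYS, which
    walks the whole keyspace in one blocking command."""
    cursor = 0
    while True:
        cursor, batch = client.scan(cursor=cursor, match=pattern, count=count)
        yield from batch
        if cursor == 0:
            break

# usage (hypothetical client r): for key in scan_all(r, "user:*"): ...
```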

I am deploying my Django backend on Heroku, using Celery with Redis as the broker.

But I am getting these errors in the worker log:

2021-08-23T17:43:05.148602+00:00 app[worker.1]: [2021-08-23 17:43:05,148: ERROR/Beat] beat: Connection error: Error while reading from socket: (104, 'Connection reset by peer'). Trying again in 32.0 seconds...
2021-08-23T17:43:35.862615+00:00 app[worker.1]: [2021-08-23 17:43:35,862: ERROR/MainProcess] consumer: Cannot connect to redis://:**@ec2-52-22-18-101.compute-1.amazonaws.com:30320//: Error while reading from socket: (104, 'Connection reset by peer').
2021-08-23T17:43:35.862624+00:00 app[worker.1]: Trying again in 32.00 seconds... (16/100)

settings.py:

CELERY_BROKER_URL = 'redis://:*******.compute-1.amazonaws.com:port'
CELERY_RESULT_BACKEND = 'redis://:***********.compute-1.amazonaws.com:port'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TASK_SERIALIZER = 'json'
CELERY_TIMEZONE = 'Asia/Kolkata'
BROKER_POOL_LIMIT = None
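One thing worth trying (an assumption on my part, mapping the redis-py keepalive and health-check options discussed earlier in this page onto Celery's broker settings) is passing broker transport options so that idle broker connections are also checked and kept alive:

```python
# settings.py (sketch): these options are forwarded by kombu's Redis
# transport to the underlying redis-py connections
CELERY_BROKER_TRANSPORT_OPTIONS = {
    'socket_keepalive': True,      # enable TCP keepalive on broker sockets
    'socket_timeout': 10,          # fail fast instead of hanging on a dead peer
    'retry_on_timeout': True,
    'health_check_interval': 30,   # PING connections idle for more than 30s
}
```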

celery.py :

from __future__ import absolute_import
import os

from celery import Celery
from celery.schedules import crontab
from django.conf import settings

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'NewsApi.settings')
app = Celery('NewsApi')

app.conf.beat_schedule = {
    'add-every-1-hour': {
        'task': 'Api.tasks.task_news_update',
        'schedule': crontab(minute=0, hour='*/1'),
    }
}

app.conf.update(timezone='Asia/Kolkata')
# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)


@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
