We have a client with a multi-data center installation of Cassandra.  While I was in another one of our data centers working on the client's cluster, I attempted to run a rebuild and got a stream error.

The Cassandra server is running this package:

cassandra30-3.0.9-1.noarch

I ran the following command:

nodetool -u <username> -pw <password> rebuild

After a little while I got the following error and tried to determine what had changed. These machines had been working just fine, and nothing configuration-wise should have changed. The usual line that everyone says, right?

java.lang.RuntimeException: Error while rebuilding node: Stream failed
        at org.apache.cassandra.service.StorageService.rebuild(StorageService.java:1107)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
        at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
        at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
        at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
        at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
        at com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
        at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
        at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
        at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
        at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346)
        at sun.rmi.transport.Transport$1.run(Transport.java:200)
        at sun.rmi.transport.Transport$1.run(Transport.java:197)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Running nodetool status shows:

Datacenter: LAS01
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens       Owns    Host ID                               Rack
?N  xxx.xxx.76.19   24.93 GB   256          ?       826b90b3-7d30-40c9-b92a-5b26372e7698  R1
DN  xxx.xxx.76.18   22.11 GB   256          ?       85849d96-c3f9-496f-a31b-96633655fc94  R1
UN  xxx.xxx.76.20   16.96 GB   256          ?       c3cbe83d-5d8b-4e20-9e21-5c76f0723aa2  R1
Datacenter: LAX03
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens       Owns    Host ID                               Rack
UN  xxx.xxx.107.22  16.97 GB   256          ?       9c09ac3b-1540-475a-8e41-a159019b4d6a  G4
UN  xxx.xxx.107.23  14.16 GB   256          ?       3a4b1145-6baa-49b0-9675-2e262d3deda0  G4
UN  xxx.xxx.107.21  17.58 GB   256          ?       80a87349-ab97-4513-9985-20ebb7b96cee  G4

The Cassandra cluster spans our Los Angeles and Las Vegas data centers.  You’ll notice that the first node in the Las Vegas data center is showing “?N” and another is showing “DN”, which is kind of a problem.  After some investigation, it looks like the servers were rebooted during a power upgrade in the cabinet.
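
Spotting the odd rows by eye works for six nodes, but it is easy to automate. Below is a minimal sketch that scans nodetool status output and lists every node whose status/state code is anything other than UN. The function name find_unhealthy_nodes and the shortened SAMPLE text are my own illustrations, not part of any Cassandra tooling, and the row pattern assumes the column layout shown above.

```python
import re

# A node row starts with a status character (U, D, or ? when unknown)
# followed by a state character (N, L, J, M), then the node address.
NODE_ROW = re.compile(r"^([UD?])([NLJM])\s+(\S+)\s")


def find_unhealthy_nodes(status_output):
    """Return (datacenter, address, code) for each node whose code is not UN."""
    unhealthy = []
    datacenter = None
    for line in status_output.splitlines():
        if line.startswith("Datacenter:"):
            # Remember which data center the following rows belong to.
            datacenter = line.split(":", 1)[1].strip()
            continue
        match = NODE_ROW.match(line)
        if match:
            code = match.group(1) + match.group(2)
            if code != "UN":
                unhealthy.append((datacenter, match.group(3), code))
    return unhealthy


# Shortened sample based on the output above (hypothetical input for the sketch).
SAMPLE = """\
Datacenter: LAS01
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens       Owns    Host ID                               Rack
?N  xxx.xxx.76.19   24.93 GB   256          ?       826b90b3-7d30-40c9-b92a-5b26372e7698  R1
DN  xxx.xxx.76.18   22.11 GB   256          ?       85849d96-c3f9-496f-a31b-96633655fc94  R1
UN  xxx.xxx.76.20   16.96 GB   256          ?       c3cbe83d-5d8b-4e20-9e21-5c76f0723aa2  R1
Datacenter: LAX03
=================
UN  xxx.xxx.107.22  16.97 GB   256          ?       9c09ac3b-1540-475a-8e41-a159019b4d6a  G4
"""

for dc, address, code in find_unhealthy_nodes(SAMPLE):
    print(f"{dc}: {address} is {code}")
```

In practice the text would come from capturing the nodetool status output rather than from a pasted string.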

The issue was that the iptables and apf firewalls got reactivated on the servers and were blocking traffic.  The servers were fine when connecting remotely, but on the local network they were blocking messages from each of the other Cassandra servers.  After disabling the firewalls, the issue cleared up.
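
A quick way to confirm this kind of blockage is to test the Cassandra ports from one of the other nodes on the local network, since that was the path being filtered. The sketch below is a rough illustration, not what I actually ran. It assumes the default ports (7000 for inter-node traffic and streaming, 7001 for encrypted inter-node traffic, 7199 for JMX, 9042 for CQL clients); if cassandra.yaml or cassandra-env.sh overrides them, substitute those values. The names port_is_open and check_node are made up for this example.

```python
import socket

# Default Cassandra ports. These are assumptions: adjust them if the
# cluster's cassandra.yaml or cassandra-env.sh uses different values.
CASSANDRA_PORTS = {
    "storage (inter-node, streaming)": 7000,
    "ssl storage": 7001,  # only listening when inter-node encryption is on
    "JMX (nodetool)": 7199,
    "native transport (CQL)": 9042,
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_node(host, ports=CASSANDRA_PORTS, timeout=3.0):
    """Return a dict mapping each port label to whether it is reachable."""
    return {name: port_is_open(host, port, timeout) for name, port in ports.items()}
```

Calling check_node with a peer's address from another Cassandra server returns one True or False per port. A False on the storage port, while the same check from a remote machine passes, would point at a host firewall filtering local traffic, which is what a failed stream during a rebuild looks like from the network side.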

Running nodetool status after turning off the firewalls shows everything looking normal:

Datacenter: LAS01
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens       Owns    Host ID                               Rack
UN  xxx.xxx.76.19   24.93 GB   256          ?       826b90b3-7d30-40c9-b92a-5b26372e7698  R1
UN  xxx.xxx.76.18   22.11 GB   256          ?       85849d96-c3f9-496f-a31b-96633655fc94  R1
UN  xxx.xxx.76.20   16.96 GB   256          ?       c3cbe83d-5d8b-4e20-9e21-5c76f0723aa2  R1
Datacenter: LAX03
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens       Owns    Host ID                               Rack
UN  xxx.xxx.107.22  16.97 GB   256          ?       9c09ac3b-1540-475a-8e41-a159019b4d6a  G4
UN  xxx.xxx.107.23  14.16 GB   256          ?       3a4b1145-6baa-49b0-9675-2e262d3deda0  G4
UN  xxx.xxx.107.21  17.58 GB   256          ?       80a87349-ab97-4513-9985-20ebb7b96cee  G4
