CAN 2.0 Problem for over 2 years!

Status
Not open for further replies.

umery2k75

There are 10 absorption chillers and 1 controller (DDC = direct digital controller) to control them. The system was installed and its functionality tested two years ago. The communication method is CAN 2.0. Once the DDC has been programmed to store the 10 node addresses, it must not forget them until its battery drains. However, the DDC appears to forget the chillers' addresses one by one, or to forget all 10 node addresses at once. The chiller then goes OFFLINE and cooling stops. These days they have bypassed the DDC and the chillers are run manually.
I came to know about the problem recently. I was fascinated and wanted to discover the reason behind it. The correspondence with the manufacturer about this problem has gone on for about 1.5 years. The manufacturer sent a new cable and a new controller, and everything has been replaced, but the problem remains. The manufacturer is not willing to send its engineers because of the security situation in the country. When I first went over there, I saw this.



The middle box is the controller, the two upper black units are the inverters (DANFOSS VLT2800), and the left panel is a digital controller with some relays. My first step was to ask them to move this controller away from the inverters. These two inverters have a 4.5 kHz modulating frequency, and I was afraid this could be wiping the controller's RAM. Sometimes all the addresses in the controller are lost at once, sometimes they are lost one by one. I was not trying to zoom in on the problem, just trying to solve it by common practice. I moved this unit away from everything and even put metal sheet all around it so that interference could not affect it. The pictures of the controller are:





The 160-pin IC is the controller. The IC marked AMD is the flash. The upper two are the RAM. I have looked at their datasheets:
Controller=MC68376BGMAB20
Flash=AM29F800BB
2 RAM=R1LP0408CSP-5S1

I don't have a picture of the absorption chiller unit's electronic controller. It looks like this



and the unit looks like this

82_1270185565.jpg



It has a Motorola controller. The part number is DSP56F807PY80.

To summarize the details above: the problem has been there for over 2 years. The person who did the installation has done installations in many places and is experienced. Only this house has the problem. Every piece of electronic equipment has been replaced except the chiller units' electronic boards. My guess is that either the DDC forgets the node address or the chiller unit forgets its own address. In both cases the chiller goes offline. There's no pattern. Any 1 of the 10 units can go offline. The DDC may remember the addresses for as long as 1 week. Sometimes the node addresses are lost after a few hours. Sometimes the DDC forgets the addresses after 2 days. The maximum period is 1 week. The units go off one by one or all together over a period of time. Nobody can guess which unit will go offline or when. I have seen JTAG ports on the DDC and on the absorption chiller unit. I have written to the manufacturer, but there's no response yet. I'm interested in seeing where the chiller unit addresses are stored in RAM, so that I can monitor that data register. There are also CAN bus testers and analysers on the market. Are they helpful? I also want to make sure the termination resistance is 120 ohm.
Please suggest what to do.
 

Today I phoned Italy. They said it's a strange problem, that this is not possible, and that there must be something wrong with the connections. I checked all the connections against the National Electrical Code (NEC). All the work is good. How can I check the signal-to-noise ratio of the CAN bus without purchasing a CAN bus tester at this time? Or, if I have to purchase one, what is the acceptable level for the different baud rates? I have asked for the debugging device that connects to the JTAG port of their hardware. Is it possible for me to make a circuit like this? Not this exact circuit, but one that monitors the CAN bus and can detect how many IDs are present. What more can be done by putting in a SPY device that just listens?
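As a rough illustration of the "spy that just listens" idea: once any listen-only interface has logged the bus traffic to a text file, counting the distinct IDs is a few lines of Python. This is only a sketch. It assumes the log uses the can-utils `candump -l` line shape, `(timestamp) interface ID#DATA`, and the IDs 0x101 and 0x102 in the sample are made up; the real chiller IDs are not known from this thread.

```python
import re
from collections import Counter

# Assumed log line shape (can-utils `candump -l` style):
#   (1270185565.000100) can0 101#0A
LINE_RE = re.compile(r"^\((\d+\.\d+)\)\s+(\S+)\s+([0-9A-Fa-f]+)#([0-9A-Fa-f]*)$")


def count_ids(lines):
    """Return a Counter mapping CAN identifier (int) -> number of data frames seen."""
    seen = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything not in the assumed shape
        seen[int(match.group(3), 16)] += 1
    return seen


# Hypothetical sample log; the IDs are placeholders, not the real chiller IDs.
sample_log = [
    "(1270185565.000100) can0 101#0A",
    "(1270185565.000900) can0 102#0B",
    "not a frame",
    "(1270185565.001700) can0 101#0C",
]

ids = count_ids(sample_log)
print(len(ids), "distinct IDs:", sorted(hex(i) for i in ids))
```

If fewer distinct IDs show up than there are nodes on the bus, that already tells you which units have stopped talking, independent of what the DDC believes.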

 

I don't exactly understand what you intend to see with a CAN tester. The problem seems to be an internal configuration problem
of the controller, not a CAN communication problem. Or did I miss something?

When you say the address got lost, I assume that you see a changed configuration and that operation resumes
after restoring the configuration?
 

Yes, it appears to be a hardware problem. The controller forgets the addresses one by one. A friend of mine has worked with CAN bus. He said packet collision is not possible in CAN, whereas in Ethernet packet collisions occur and data gets lost. On the CAN bus, each device sends its message with a priority so as to avoid collisions. The device with the highest priority sends its data while the others wait. When a specific device keeps sending high-priority messages and gives the other devices no chance, that device gets shut down from the system so as to avoid a traffic jam. I don't know whether all the devices are fighting on the CAN bus to send the highest-priority message and getting shut down one by one.
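For reference, CAN arbitration is bitwise: a dominant bit (0) overrides a recessive bit (1) on the wire, so when several nodes start transmitting together, the frame with the numerically lowest identifier wins and the others back off and retry, with nothing lost. Note also that in the CAN specification a node goes "bus-off" when its transmit error counter exceeds 255, not because it wins arbitration too often. The sketch below simulates only the arbitration step, assuming unique 11-bit identifiers; the example IDs are invented.

```python
def arbitrate(ids, width=11):
    """Return the identifier that wins bitwise CAN arbitration.

    Assumes all identifiers are unique and fit in `width` bits.
    """
    contenders = set(ids)
    if not contenders:
        raise ValueError("no identifiers given")
    for bit in range(width - 1, -1, -1):  # identifiers are sent MSB first
        # The bus behaves like a wired-AND: it is dominant (0) if any
        # contender drives a 0 at this bit position.
        bus_level = min((i >> bit) & 1 for i in contenders)
        # Nodes that sent recessive (1) but read back dominant (0) drop out.
        contenders = {i for i in contenders if (i >> bit) & 1 == bus_level}
    (winner,) = contenders
    return winner


# Hypothetical identifiers, for illustration only.
print(hex(arbitrate([0x101, 0x0F0, 0x10A])))
```

The winner is always the smallest identifier in the set, which is why low IDs are the high-priority ones.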
 

Hi umery,
Just a dumb question: did you check the supply lines?
It seems to me there are two possible reasons for the data loss: 1, Voltage spikes (or disturbances, such as bridge/switcher problems)
2, Your environment is full of EMI and/or ionized particles...
This assumes that your components are absolutely fault-free!
Can you tell me why the chiller unit electronic boards weren't changed?
Isn't it possible for you to put your suspect cards into a "good system", or to borrow some cards from a known-good one?
Maybe the "old installer" has some spares?
Can you not fit extra transient absorbers across your supplies and extra mains EMC/EMI filtering?
Do you have a transformer station or mobile phone base stations in the neighborhood, or welding equipment, high-power thyristor regulators, lift motors...?
K.
 

1) This is the 24 V AC signal coming into the controller. I didn't take an FFT of it.



2) No, the environment appeared fine to me. I moved the controller away from everything. I have borrowed one DDC controller; it's at my home. The other is installed at their house. I don't have a spare for the chiller. If I apply maths, there is a very low probability that all 10 chiller units have faulty electronics boards at the same time. How could 10 similar boards be faulty at the same time?
There is no mobile phone station or anything else suspicious. However, there is one dangerous thing over there: their big DOG. It barks like hell every time it sees me. I have to pass close to him to get to the chiller room :D

2, Your environment is full of EMI and/or ionized particles...
How can I make sure the environment is good? What sort of equipment must I use?

I am interested in buying this product; the man uses the device for detecting electric fields. It gives a buzzing sound for 50/60 Hz. I'm going to place an order soon.


Those Italians said I have to wait for the reply till the 6th of this April; they have national holidays. I think there is a bug in the hardware. I have seen many Taiwanese and Chinese engineers coming to the sole distributor of UPSs, inverters, and chopper supplies. Many times there is a bug in a product that has already been marketed. They come, take feedback, and improve their design and code.
 

Yes, we have a Christian holiday right now (it began on Friday and ends on Monday: Easter! :)...
Environment, EMI: I meant other disturbances. This "field detector" will not sense them; it is only for mains frequencies.
I think you may have transients/EMI disturbances in or from your power grid. These can come from a switched high-power load, welding, or an industrial RF generator (plastic molding, microwave sterilizing), etc...
The simplest EMI/EMC check instrument is an old-style SW & UHF pocket radio :) The better ones are very expensive: spectrum analyzers with field antennas and highly specialized equipment...
But if there are systems that work well, is it impossible to borrow a card from them, or to put your "bad" one into their well-functioning system?
You see, I can believe in a bug in sold systems, but you have some functioning cards and some semi-functioning ones, so I cannot really believe in a design bug in that case... Or is the system at its extremes/limits, so that some components are OK but others cannot (quite) do the job?
With this oscilloscope resolution a transient cannot be checked: it may be only 10-100 µs wide and your sampling is slower. But there may be, for example, a 20 V spike on the +3.3 V/5 V logic supply; the behaviour has to be guessed from your system...
A similar problem is short power interruptions (dropouts). You would have to monitor the line voltages for some hours for both of these (possible) failure sources, which is not so simple to do, I think.
Sorry, I don't have a better idea at the moment :-(
Good luck with the troubleshooting!
K.
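To put numbers on the point about the scope being too slow for a 10-100 µs transient: the sample spacing is the captured time window divided by the record length. The sketch below works in integer nanoseconds. The scope settings (20 ms/div, 10 divisions, 2500-point record) are assumed purely for illustration and are not taken from the screenshot above.

```python
def sample_interval_ns(time_per_div_ns, divisions, record_points):
    """Spacing between stored samples, in nanoseconds."""
    return time_per_div_ns * divisions // record_points


def samples_on_pulse(pulse_width_ns, interval_ns):
    """Number of whole sample intervals that fit inside a pulse."""
    return pulse_width_ns // interval_ns


# Assumed settings: 20 ms/div, 10 divisions, 2500-point record.
interval = sample_interval_ns(20_000_000, 10, 2500)
print("sample interval:", interval, "ns")
print("samples on a 10 us spike:", samples_on_pulse(10_000, interval))
print("samples on a 100 us spike:", samples_on_pulse(100_000, interval))
```

With settings like these a 10 µs spike can fall entirely between two samples, so a much faster timebase with a trigger on the supply rail, or a peak-detect acquisition mode if the scope has one, would be needed to catch it.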

Added after 8 minutes:

A newer idea:
If there are EEPROMs on your board (or in some chip), is it possible that their "soft info" gets erased, for example by detector equipment at airports (strong X-rays / high-power ultra-wideband microwaves...)?
I don't have any experience with this, but it is possible that some cards were packed differently for transport and were exposed to strong X-rays or "normal" static discharges...

Added after 1 hours 51 minutes:

One more idea: the CAN line must have a defined impedance. Are there terminating resistors on the data lines, and are the pulses correct and reflection-free?
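A quick way to check end termination without any CAN tool is an ohmmeter between CAN_H and CAN_L with the bus powered down: two 120 Ω terminators in parallel should read about 60 Ω, a single one reads about 120 Ω, and three read about 40 Ω. The sketch below just does that arithmetic; the 10 % acceptance band is an arbitrary assumption, not a value from any standard.

```python
def parallel(*resistors_ohm):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistors_ohm)


def termination_ok(measured_ohm, expected_ohm=60.0, tolerance=0.10):
    """True if the reading is within the (assumed) tolerance of the expected value."""
    return abs(measured_ohm - expected_ohm) <= tolerance * expected_ohm


for count in (1, 2, 3):
    reading = parallel(*([120.0] * count))
    print(count, "terminator(s):", round(reading, 1), "ohm, ok =", termination_ok(reading))
```

Only the two-terminator case passes, which is the situation you want on a linear bus with a resistor at each end.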
 

1)
There are two ways to terminate a CAN network:
i) 120Ω resistors on bus ends or
ii) terminator resistors for each CAN node as defined by the manufacturer.

They shouldn't be mixed. You can check for this. But if communication goes well for 1 week, that is not likely to be the cause, as reflections quickly build up with improper termination and no communication would take place.

2)
IDs are remembered by the units/controller until the battery drains, which means you are not using ROM. The design is RAM-based only.

3)
How do you determine whether it is the controller or the units that forget the IDs? I think it can be decided as per your description. More likely it will be the controller causing the problem.

4)
Don't rule out a software bug either.
4a)
Do all the units transmit messages periodically to the controller, or does the controller request periodic messages?
4b)
Are there intermittent messages? Are there any error indication messages from the units?

It is quite possible that upon receiving a certain message (let's forget about units, it is the message ID that is important), the controller executes a sequence of commands that corrupts its memory (a wrongly used pointer), causing it to lose the IDs to send/receive.

5)
You can get bare-bones CAN bus monitors for around $250. Building a monitor yourself won't be that costly in terms of HW, but if you don't have much experience with firmware, the time spent making that analyzer will far outweigh the savings. I'd recommend getting one.
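If the units do transmit periodically (question 4a), a monitor log can answer question 3 directly: a unit that still sends its frames while the DDC shows it offline points to the controller, and a unit that has gone silent points to the unit. A minimal sketch of the "who went silent" check follows. It assumes the capture has already been reduced to `(timestamp_seconds, can_id)` pairs; the IDs and the 5 s timeout are made-up values.

```python
def silent_nodes(frames, expected_ids, now, timeout_s):
    """Return the sorted expected IDs not heard from within `timeout_s` before `now`."""
    last_seen = {}
    for timestamp, can_id in frames:
        # Keep the most recent timestamp for each identifier.
        if can_id not in last_seen or timestamp > last_seen[can_id]:
            last_seen[can_id] = timestamp
    return sorted(
        can_id
        for can_id in expected_ids
        if can_id not in last_seen or now - last_seen[can_id] > timeout_s
    )


# Hypothetical capture: 0x101 is still alive, 0x102 stopped early, 0x103 never spoke.
frames = [(0.0, 0x101), (0.5, 0x102), (9.0, 0x101)]
missing = silent_nodes(frames, {0x101, 0x102, 0x103}, now=10.0, timeout_s=5.0)
print([hex(i) for i in missing])
```

Logging the time at which each ID drops out, next to the time the DDC reports the unit offline, would show which side loses track first.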
 

Have you checked that the signal cable is shielded and that the control panel is well connected to the earth cable?
 

I'm going to check it within a few days. I will post pictures and video of it and mail the same documentation to the company concerned.
 
