Saturday, August 5, 2017

First impressions of VEGA on LN2

This Friday I ran some very quick tests with VEGA FE on LN2. I don't have many screenshots but I do have some interesting information.

Does it work under LN2 at all?

Yes it does. All the way to -185C. No issues whatsoever. No cold slow, no cold bug or boot bug, and no black screens under load either. I only tested at stock voltage, and the black screen bug tends to show up at high core voltages, but for now it looks like smooth sailing even at full pot.

Does it scale with temperature?

Kinda. On water at stock volts my card does 1680/1100 at best in 3DMark Time Spy while bouncing off the power limit. Under LN2 at stock Vcore I was perfectly stable at 1800/1100. I was in a rush because the card wasn't properly insulated, so I didn't try 1850/1100, but 1900/1100 crashed. 2000/1100 would pass Time Spy, but the score was lower than at stock clocks, so it looks like VEGA can pull the same stunt as Pascal, where it appears to run at very high clocks while underperforming horrifically. I'm sure extra Vcore will solve this, as evidently I didn't have enough Vcore to run 1900 properly. Another interesting side effect of the LN2 is that VEGA's power draw fell off a cliff: roughly 100W less than air cooled on +50% power limit. So the GPU core evidently is very happy with LN2 and could probably go well in excess of 2GHz with more voltage (I did all my testing at stock, which is 1.2V).
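To put a number on the scaling, here is a minimal Python sketch. The function names are my own for illustration and are not part of any benchmarking tool. It works out the relative core clock gain from the two stable results above and includes a simple check for the "higher clock, lower score" behaviour, assuming you have a benchmark score for both runs.

```python
def clock_gain_percent(baseline_mhz: float, new_mhz: float) -> float:
    """Relative core clock increase over the baseline, in percent."""
    return (new_mhz - baseline_mhz) / baseline_mhz * 100.0


def is_underperforming(baseline_score: float, new_score: float,
                       baseline_mhz: float, new_mhz: float) -> bool:
    """True when the clock went up but the benchmark score did not.

    Hypothetical helper: the scores are whatever your benchmark reports.
    """
    return new_mhz > baseline_mhz and new_score <= baseline_score


# Core clocks from the runs above: best on water vs. stable on LN2,
# both at stock Vcore.
print(f"{clock_gain_percent(1680, 1800):.1f}% core clock gain")
```

Going from 1680MHz to 1800MHz is only about a 7% bump, which is why I'd call the stock voltage scaling "kinda" rather than great.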

The HBM2, on the other hand, seems to have some major issues. I could get GPU-Z's render test to run just fine at 1230MHz HBM2 clock, but any real workload like 3DMark Time Spy would crash even 1MHz above 1100MHz. I can think of a number of things that could be causing this, and all of them need more testing. First of all, the HBM might just need more voltage on either VDD or VPP to sustain load. The memory timings might also simply be too tight for any clock above 1100MHz. It could also have been a thermal problem. While pulling the card down I heard something that sounded very much like the thermal paste failing, but GPU core temperatures were still below 0 and the LN2 pot side temperature probe was responding under load as it should. However, the HBM2 stacks don't have any accessible temperature readings and aren't exactly one with the GPU core silicon, so the HBM stacks could have lost contact while the GPU core was doing just fine. In that case the HBM would stay cool at low loads but warm up and crash under full load. Either way this needs more testing.

The display outputs freeze over pretty quickly, as the only things between them and the GPU core are the VPP, VDDCI and display drive VRMs. None of those put out any significant amount of heat, so the cold from the GPU core reaches the display outputs pretty quickly.
The back of the card. I used the mounting bracket from the Raijintek Morpheus II cooler to get a more secure mount for the Der8auer Raptor 4 LN2 pot.
The red wire is hooked up to the Vcore plane so I could check voltages with a DMM. I also added some 2.5V SMD polymer caps to both Vcore and the VDD rail. At ambient those did nothing for overclocking capabilities, and I haven't done a before-and-after test on LN2 either, so I have no idea whether they are helping or by how much.
This pic of the GPU frozen just looks cool. You can kinda see the infill around the HBM2 dies. I'm really glad that it's been added, as it should make it much, much more difficult to damage the HBM interposer when replacing cooling systems on the GPU core.

And here we can see something I found pretty interesting. For some reason a small piece of the thermal paste stayed on the GPU core while all the rest of the paste stayed on the LN2 pot.
With the card on LN2 the VRMs would all freeze over at idle, as they don't produce any appreciable heat. Under load the ice on the Vcore VRM would very quickly melt. This might cause some major water problems during extended sessions, as keeping the VRM from cycling between sub-zero and positive temperatures is pretty much impossible.

Overall I must say I'm excited to try running VEGA on LN2 seriously once we get some proper drivers for the cards and I get more LN2.

The only score I saved from the session: