![[Pasted image 20240130103300.png]]

> btw, I suspect that the CoordConv at the first layer (coord0) might not be necessary
> at the first layer the CNN just does low-level feature detection (detecting edges and so on)
> the CoordConv layers are only helpful for higher-level analysis deeper in the model
> at least that's my theory
> also, do I see correctly that after the first layer, there's 1 maxpool that makes it 4x smaller? from 2048x1024 to 512x256?
> or does the first layer use stride 2, and the maxpool only makes it 2x smaller?
> overall the architecture looks good :+1:
> - Comments [[Floris]]
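
To sanity-check the downsampling question in the comments, here is a minimal sketch of the shape arithmetic for both readings. The kernel sizes and padding (3x3 conv with padding 1, 4x4 or 2x2 maxpool) are assumptions for illustration, not values taken from the actual model; only the 2048x1024 and 512x256 resolutions come from the comments.

```python
def out_size(n, kernel, stride, padding=0):
    """Output length along one axis for a conv or pooling layer (floor mode, no dilation)."""
    return (n + 2 * padding - kernel) // stride + 1


def out_shape(shape, kernel, stride, padding=0):
    """Apply out_size to every spatial axis of `shape`."""
    return tuple(out_size(n, kernel, stride, padding) for n in shape)


INPUT = (2048, 1024)

# Reading A (assumed): 3x3 conv, stride 1, padding 1, then a 4x4 maxpool with stride 4.
a = out_shape(out_shape(INPUT, kernel=3, stride=1, padding=1), kernel=4, stride=4)

# Reading B (assumed): 3x3 conv, stride 2, padding 1, then a 2x2 maxpool with stride 2.
b = out_shape(out_shape(INPUT, kernel=3, stride=2, padding=1), kernel=2, stride=2)

print(a, b)  # (512, 256) (512, 256)
```

Under these assumptions both readings land on 512x256, so the resolutions in the diagram alone cannot tell them apart; the stride of the first layer has to be checked in the model definition.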