Channel: Intel® Software - Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library

MKL DCT using Fortran: scaling prefactors and complex-type arrays

Hi,

I'm currently using MKL to compute DCTs on complex-type Fortran arrays. My Fortran module starts with an initialization routine that looks like this:

   subroutine initialize(this,n, ni, nd)
     class(costf_odd_t) :: this
     !-- Input variables
     integer, intent(in) :: n
     integer, intent(in) :: ni
     integer, intent(in) :: nd
     !-- Local variables:
     integer :: stat
     real(cp) :: fac
     real(cp) :: work(n)
     this%nRad = n
     allocate(this%i_costf_init(128))
     allocate(this%d_costf_init(1:5*(n-1)/2+2))
     call d_init_trig_transform(n-1,MKL_COSINE_TRANSFORM,this%i_costf_init,this%d_costf_init,stat)
     call d_commit_trig_transform(work,this%r2c_handle,this%i_costf_init,this%d_costf_init,stat)
     !fac = sqrt(half*real(this%nRad-1,cp))
     !stat = DftiSetValue(this%r2c_handle, DFTI_FORWARD_SCALE, fac)
     !stat = DftiCommitDescriptor(this%r2c_handle)
   end subroutine initialize

The actual DCT is then computed later on as follows:

   subroutine costf1_complex(this,f,n_f_max,n_f_start,n_f_stop)
     class(costf_odd_t) :: this
     !-- Input variables:
     integer,  intent(in) :: n_f_start,n_f_stop ! columns to be transformed
     integer,  intent(in) :: n_f_max
     complex(cp), intent(inout) :: f(n_f_max,this%nRad)
     !-- Local variables:
     real(cp) :: work_real(this%nRad)
     real(cp) :: work_imag(this%nRad)
     integer :: stat, n_f

     do n_f=n_f_start,n_f_stop
        work_real(:) = real(f(n_f,:))
        work_imag(:) = aimag(f(n_f,:))
        call d_forward_trig_transform(work_real,this%r2c_handle,this%i_costf_init,this%d_costf_init,stat)
        call d_forward_trig_transform(work_imag,this%r2c_handle,this%i_costf_init,this%d_costf_init,stat)
        f(n_f,:)=sqrt(half*real(this%nRad-1,cp))*cmplx(work_real,work_imag,kind=cp)
     end do
   end subroutine costf1_complex

It basically does what I want, but it has two performance limitations: (i) the prefactor multiplication is applied after every DCT, and (ii) the real and imaginary parts have to be copied into separate work arrays, because the MKL trig transforms do not support complex-type input arrays.

I tried to fix the first issue with the three commented-out lines at the end of the initialize subroutine above, but it didn't work. Any idea why? Concerning the second issue, is there a way to avoid allocating the work arrays in the costf1_complex subroutine (maybe using C-type pointers)?

Thanks!

 

